Applications of coherent classical communication and the 
Schur transform to quantum information theory 

by 

Aram Wettroth Harrow 

B.S., Massachusetts Institute of Technology (2001) 

Submitted to the Department of Physics 
in partial fulfillment of the requirements for the degree of 

Doctor of Philosophy in Physics 

at the 

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 

September 2005 

© Aram Wettroth Harrow, MMV. All rights reserved. 

The author hereby grants to MIT permission to reproduce and distribute publicly 
paper and electronic copies of this thesis document in whole or in part. 



Author 

Department of Physics 
August 4, 2005 



Certified by 

Isaac L. Chuang 

Associate Professor of Electrical Engineering and Computer Science, and Physics 

Thesis Supervisor 



Accepted by 



Thomas J. Greytak 
Professor of Physics 





Applications of coherent classical communication and the Schur transform 

to quantum information theory 

by 

Aram Wettroth Harrow 

Submitted to the Department of Physics 
on August 4, 2005, in partial fulfillment of the 
requirements for the degree of 
Doctor of Philosophy in Physics 

Abstract 

Quantum mechanics has led not only to new physical theories, but also a new understanding of 
information and computation. Quantum information not only yields new methods for achieving 
classical tasks such as factoring and key distribution but also suggests a completely new set of quantum 
problems, such as sending quantum information over quantum channels or efficiently performing 
particular basis changes on a quantum computer. This thesis contributes two new, purely quantum, 
tools to quantum information theory: coherent classical communication in the first half and an 
efficient quantum circuit for the Schur transform in the second half. 

The first part of this thesis (Chapters 1-4) is in fact built around two loosely overlapping themes. 
One is quantum Shannon theory, a broad class of coding theorems that includes Shannon and 
Schumacher data compression, channel coding, entanglement distillation and many others. The second, 
more specific, theme is the concept of using unitary quantum interactions to communicate between 
two parties. We begin by presenting new formalism: a general framework for Shannon theory that 
describes communication tasks in terms of fundamental information processing resources, such as 
entanglement and classical communication. Then we discuss communication with unitary gates and 
introduce the concept of coherent classical communication, in which classical messages are sent via 
some nearly unitary process. We find that coherent classical communication can be used to derive 
several new quantum protocols and unify them both conceptually and operationally with old ones. 
Finally, we use these new protocols to prove optimal trade-off curves for a wide variety of coding 
problems in which a noisy channel or state is consumed and two noiseless resources are either consumed 
or generated at some rate. 

The second half of the thesis (Chapters 5-8) is based on the Schur transform, which maps between 
the computational basis of $(\mathbb{C}^d)^{\otimes n}$ and a basis (known as the Schur basis) which simultaneously 
diagonalizes the commuting actions of the symmetric group $S_n$ and the unitary group $\mathcal{U}_d$. The Schur 
transform is used as a subroutine in many quantum communication protocols (which we review and 
further develop), but previously no polynomial-time quantum circuit for the Schur transform was 
known. We give such a polynomial-time quantum circuit based on the Clebsch-Gordan transform and 
then give algorithmic connections between the Schur transform and the quantum Fourier transform 
on $S_n$. 

Thesis Supervisor: Isaac L. Chuang 

Title: Associate Professor of Electrical Engineering and Computer Science, and Physics 
Chapter 0 

Introduction 



0.1 Motivation and context 

Classical theories of information and computation: Though it may seem like a recent phenomenon, 
computation (the manipulation, storage and transmission of information) has long been one of the 
most central features of human civilization. Markets of buyers and sellers perform distributed 
computations to optimize the allocation of scarce resources, natural languages carefully balance the goals of 
reducing redundancy while correcting errors, and legal systems have long sought reliable algorithms of 
justice that are accurate and efficient even when implemented with unreliable components. Although 
these examples cannot be totally separated from human intelligence, they all rely on an impersonal 
notion of information that has two crucial attributes. First, information can be abstracted away 
from any particular physical realization; it can be photocopied, memorized, dictated, transcribed 
and broadcast, always in principle largely preserving the original meaning. Likewise an abstract 
algorithm for processing information can be performed equivalently using pencil and paper or with 
digital circuits, as long as it is purely mechanical and makes no use of human insight or creativity. 
Though the particular features and efficiency of each model of computation may differ, the class of 
problems they can solve is the same.* 

These ideas of computation and information were expressed in their modern forms by Turing 
and Church in 1936[Tur36, Chu36] and Shannon in 1948[Sha48], respectively. Turing described a 
hypothetical machine meant to be able to perform any purely mechanical computation, and indeed 
every method of computation so far devised can be simulated by a Turing machine. Moreover, most 
practical algorithms used today correspond to the class of problems that a Turing machine can solve 
given a random number generator and running time bounded by a polynomial of the input size. 
While Turing showed the fungibility of computation, Shannon proved that information is fungible, 
so that determining whether any source can be reliably transmitted by any channel reduces, in the 
limit of long strings, to calculating only two numbers: the information content of the source and the 
information capacity of the channel. 

The abstract theories of Turing and Shannon have been extraordinarily successful because they 
have happened to match the information-processing technology within our reach in the 20th century; 
Shannon capacities are nearly achievable by practical codes and most polynomial time algorithms 
are feasible on modern computers. However, our knowledge of quantum mechanics is now forcing us 
to rethink our ideas of information and computation, just as relativity revised our notions of space 
and time. The state of a quantum mechanical system has a number of properties which cannot be 
reduced to the former, classical, notion of information. 

The challenge from quantum mechanics: The basic principles of quantum mechanics are simple 
to state mathematically, but hard to understand in terms we are familiar with from classical theories 

*For two very different perspectives on these ideas, see Cybernetics (1948) by N. Wiener and The Postmodern 
Condition (1979) by J.-F. Lyotard. 






of physics and information. A quantum system with d levels (e.g. an electron in the p orbital of an 
atom, which can be in the $p_x$, $p_y$ or $p_z$ states) has a state described by a unit vector $|\psi\rangle$ that belongs 
to a d-dimensional complex vector space. Thus, an electron could be in the $p_x$ or $p_y$ state, or in 
a linear combination of the two, known in chemistry as a hybrid orbital, or in quantum mechanics 
as a superposition. Systems combine via the tensor product, so the combined state space of n 
d-level systems is $d^n$-dimensional. A measurement with K outcomes is given by a collection of matrices 
$\{M_1, \ldots, M_K\}$ such that outcome k has probability $\langle\psi|M_k^\dagger M_k|\psi\rangle$ (here $\langle\psi|$ is the Hermitian conjugate 
of $|\psi\rangle$) and results in the normalized output state $M_k|\psi\rangle / \sqrt{\langle\psi|M_k^\dagger M_k|\psi\rangle}$; any measurement is possible 
(on a finite-dimensional system) as long as it satisfies the normalization condition $\sum_k M_k^\dagger M_k = I$. 
The possible forms of time evolution are entirely described by the constraints of normalization and 
linearity; they correspond to maps from $|\psi\rangle$ to $U|\psi\rangle$, where U is a unitary operator ($U^\dagger U = I$). 
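
The measurement rule above can be made concrete with a small numerical sketch. The state, the two projectors and the helper names (`apply`, `measure`) are illustrative choices rather than anything defined in this thesis; the code only evaluates the outcome probabilities and the normalized post-measurement states for a single qubit.

```python
import math

def apply(M, psi):
    """Multiply a matrix M (list of rows) by a state vector psi."""
    return [sum(M[r][c] * psi[c] for c in range(len(psi))) for r in range(len(M))]

def measure(ops, psi):
    """For each measurement operator M_k, return (probability, post-measurement state)."""
    results = []
    for M in ops:
        out = apply(M, psi)
        p = sum(abs(a) ** 2 for a in out)  # equals <psi| M_k^dagger M_k |psi>
        post = [a / math.sqrt(p) for a in out] if p > 1e-12 else None
        results.append((p, post))
    return results

# Illustrative qubit (d = 2): an equal superposition of |0> and |1> with a relative phase.
psi = [1 / math.sqrt(2), 1j / math.sqrt(2)]

# Projectors onto |0> and |1>; together they satisfy sum_k M_k^dagger M_k = I.
ops = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]

outcomes = measure(ops, psi)
for k, (p, post) in enumerate(outcomes):
    print("outcome", k, "probability", round(p, 6), "state", post)
```

Both outcomes occur with probability 1/2 here, and each post-measurement state is again a unit vector, as the normalization condition requires.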

These principles bear a number of resemblances to classical wave mechanics, and at face value may 
not appear particularly striking. However, they have dramatic implications when quantum systems 
are used to store and manipulate information. 

• Exponentially long descriptions: While n copies of a classical system require $O(n)$ bits to 
describe, n copies of a comparable quantum system cannot be accurately described with fewer than 
$\exp(O(n))$ bits. This is a direct consequence of the tensor product structure of composite 
quantum systems, in which n two-level systems are described by a unit vector in a $2^n$-dimensional 
complex vector space. On the other hand, the largest classical message that can be reliably 
encoded in such a system is n bits long [Hol73]. This enormous gap cannot be explained by any 
classical model of information, even when probabilistic or analog models are considered. 

• Nonlocal state descriptions: Another consequence of applying the tensor product to state spaces 
is that a composite system AB can be in an entangled state that cannot be separated into a 
state of system A and a state of system B. While correlated probability distributions have a 
similar property, an entangled quantum system differs in that the system as a whole can be 
in a definite state, while its parts still exhibit (correlated) randomness. Moreover, measuring 
entangled states yields correlations that cannot be obtained from any classical correlated random 
variable [Per93], though they nevertheless do not permit instantaneous communication between 
A and B. 

• Reversible unitary evolution: Since time evolution is unitary, it is always reversible. 
(Measurement is also reversible once we include the measuring apparatus; see [Per93] or Section 1.1 of this 
thesis for details.) As an immediate consequence, quantum information can never be deleted, 
only rearranged, perhaps into a less accessible form. An only slightly more complicated 
argument can prove that it is impossible to copy an arbitrary quantum state [WZ82], unless we know 
that the state belongs to a finite set that is perfectly distinguishable by some measurement. 

This contrasts sharply with one of classical information's defining properties, its infinite 
reproducibility. The idea of possessing information takes on an entirely new meaning when 
referring to quantum information, one that we are only barely beginning to appreciate (e.g. see 
[Pre99, GC01]). 

• Complementary observables: Another way to prove that quantum information cannot be cloned 
is via the uncertainty principle, which holds that complementary observables, such as position 
and momentum, cannot be simultaneously measured; observing one necessarily randomizes the 
other. The reason this implies no-cloning is that making a perfect copy of a particle would 
allow the position of one and the momentum of the other to be measured, thereby inferring 
both quantities about the original system. 

Even though the uncertainty principle describes limitations of quantum information, quantum 
cryptography turns this into a strength of quantum communication, by using uncertainty to 
hide information from an eavesdropper. The idea is to encode a random bit in one of two 
randomly chosen complementary observables, so that without knowing how the bit is encoded, 
it is impossible to measure it without risking disturbance. This can detect any eavesdropper, 
no matter how sophisticated, and even if the quantum information is sent through completely 
insecure channels. Combined with public classical communication, this process can be used to 
send unconditionally secure messages [BB84]. 

• Interference of amplitudes: In the two-slit experiment, two beams of light from point sources 
(such as slits cut into a screen) overlap on a screen, but instead of simply adding, yield alter- 
nating bands of constructive and destructive interference. One insight of quantum mechanics 
is that particles are waves with complex amplitudes, so that interference is still found in the 
two-slit experiment with single photons, electrons, or even molecules. Measurement breaks 
the quantum coherence which makes this possible, so observing which slit an electron passes 
through, no matter how gently this is performed, completely eliminates the interference effect. 

The power of interference would be dramatically demonstrated by building a large-scale quantum 
computer and using it to solve classical problems. Such a computer could interfere different 
branches of a computation in much the same way that different paths of an electron can interfere. 
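
The tensor-product structure behind the first two points in the list above can be sketched numerically. The helper names (`kron`, `is_product_state`) and the example states are illustrative assumptions, not notation from this thesis; the product-state test uses the standard fact that a two-qubit pure state factorizes exactly when its 2 x 2 matrix of amplitudes has zero determinant.

```python
import math

def kron(u, v):
    """Tensor (Kronecker) product of two state vectors."""
    return [a * b for a in u for b in v]

def is_product_state(state, tol=1e-12):
    """Two-qubit amplitudes (a00, a01, a10, a11) describe a product state
    exactly when a00*a11 - a01*a10 vanishes."""
    a00, a01, a10, a11 = state
    return abs(a00 * a11 - a01 * a10) < tol

plus = [1 / math.sqrt(2), 1 / math.sqrt(2)]  # (|0> + |1>)/sqrt(2)

# Describing n qubits takes 2**n amplitudes: here n = 10.
state = [1.0]
for _ in range(10):
    state = kron(state, plus)
print(len(state))  # 1024

product = kron([1.0, 0.0], plus)                        # |0> tensor |+>
bell = [1 / math.sqrt(2), 0.0, 0.0, 1 / math.sqrt(2)]   # (|00> + |11>)/sqrt(2)
print(is_product_state(product), is_product_state(bell))
```

The second state is entangled: it is a definite unit vector for the pair, yet no choice of single-qubit states reproduces it as a tensor product.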

These examples are significant not only because they expand the range of what is efficiently 
computable, but because they force us to revise the logical terms with which we understand the 
world around us. We can no longer say that an electron either went through one path or the other, 
or that a quantum computer took a particular computational path or that Schrödinger's cat must be 
either alive or dead. At one point, this suggested that quantum theory needed to be revised, but now a 
consensus is emerging that it is instead classical logic that needs to be rethought. 

The operational approach to quantum information: Unfortunately, ever since quantum mechanics 
was first articulated seventy years ago, it has been difficult to give a clear philosophical interpretation 
of quantum information. In the last 10-20 years, though, a good deal of progress has been made by 
thinking about quantum information operationally, and studying how information-processing tasks 
can be accomplished using quantum systems. At the same time, we would like to study quantum 
information in its own right, preferably by abstracting it away from any particular physical realization. 

This operational-yet-abstract approach to quantum information is best realized by the idea of 
quantum computation. While classical computers are based on bits, which can be either 0 or 1, 
quantum computers operate on quantum bits or qubits, which are 2-level quantum systems. Each 
state of a quantum memory register (a collection of n qubits, hence with $2^n$ states) has its own complex 
amplitude. Performing an elementary quantum gate corresponds to multiplying this (length $2^n$) vector 
of amplitudes by a unitary matrix of size $2^n \times 2^n$. If we prepare an input with nonzero amplitude 
in many different states, we can run a computation in superposition on all of these input states and 
then interfere their output amplitudes, just as the amplitudes of different paths of an electron can 
interfere. Certain problems appear to lend themselves well to this approach, and allow us to observe 
constructive interference in "correct" branches of the computation and destructive interference in 
"incorrect" branches; needless to say, this technique is completely impossible on classical probabilistic 
computers. For example, Shor's algorithm[Sho94] is able to use interference to factor integers on a 
quantum computer much faster than the best known classical algorithm can. 
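
A one-qubit sketch of this contrast, under illustrative assumptions: the Hadamard gate stands in for a generic elementary gate and a fair coin flip for a classical probabilistic step. Applying the gate twice makes the two paths leading to the state 1 cancel, whereas two coin flips leave the probabilities uniform.

```python
import math

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(M[r][c] * v[c] for c in range(len(v))) for r in range(len(M))]

s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]            # Hadamard gate: unitary, acts on amplitudes
COIN = [[0.5, 0.5], [0.5, 0.5]]  # fair coin flip: stochastic, acts on probabilities

# Start in |0> (amplitude vector) or in the bit value 0 (probability vector).
amps = matvec(H, matvec(H, [1.0, 0.0]))
quantum_probs = [abs(a) ** 2 for a in amps]
classical_probs = matvec(COIN, matvec(COIN, [1.0, 0.0]))

print([round(p, 6) for p in quantum_probs])   # amplitudes for 1 cancel
print(classical_probs)                         # probabilities cannot cancel
```

The negative entry of the gate is what allows cancellation; a stochastic matrix has only nonnegative entries, so no sequence of coin flips can undo randomization in this way.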

Other applications use the fact that amplitudes can add linearly, while probability (or intensity) 
is proportional to amplitude squared. This is used in Grover's algorithm [Gro96] to search a database 
of $N$ items with time $O(\sqrt{N})$, or in the more colorful application of "interaction-free measurement," 
which can safely detect a bomb that will explode if it absorbs a single photon. Here the idea is to 
constructively interfere $N$ photons, each of amplitude $1/N$, while randomizing the phase that the 
bomb sees, so that the bomb experiences a total intensity of $N \cdot (1/N)^2 = 1/N$, which can be made 
arbitrarily small (see [RG02] and references therein). 
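
The arithmetic in this paragraph can be checked directly. This is a simplified sketch, not a model of the actual optical protocol: it assumes that randomized phases make the cross terms average to zero, so that intensities add rather than amplitudes, and the function names are hypothetical.

```python
def coherent_intensity(n):
    """Amplitudes add first, then are squared: (n * 1/n)**2."""
    total_amplitude = sum(1 / n for _ in range(n))
    return abs(total_amplitude) ** 2

def incoherent_intensity(n):
    """With randomized phases the intensities add: n * (1/n)**2."""
    return sum(abs(1 / n) ** 2 for _ in range(n))

for n in (10, 100, 1000):
    print(n, coherent_intensity(n), incoherent_intensity(n))
```

The coherent total stays at order one for every n, while the phase-randomized total falls off as 1/n.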

Purely quantum problems in quantum information: So far all of the examples of the power of 
quantum information describe goals that are defined entirely in terms of classical information (sharing 
secret bits, unstructured search, factoring integers) but are more efficiently achieved using quantum 
information processing resources; we might call these hybrid classical-quantum problems. 

As our understanding of quantum information has improved, we have also begun to study 
information processing tasks which are purely quantum; for example, we might ask at what rate a 
noisy quantum channel can reliably transmit quantum messages. In fact, it is even possible to think 
of classical information entirely as a special case of quantum information, a philosophy known as 
the "Church of the Larger Hilbert Space"* which Section 1.1 will explain in detail. The two main 
contributions of this thesis involve such "purely quantum" tasks, in which both the problem and the 
solution are given in terms of quantum information. Before explaining them, we will discuss the fields 
of research that give them context. 

Quantum information theory (or more specifically, quantum Shannon theory) seeks a quantita- 
tive understanding of how various quantum and classical communication resources, such as 
noisy channels or shared correlation, can be used to simulate other communication resources. 
The challenge comes both from the much richer structure of quantum channels and states, and 
from the larger number of communication resources that we can consider; for example, 
channels can be classical or quantum or can vary continuously between these possibilities. Moreover, 
(quantum) Shannon theory studies asymptotic capacities; we might ask that n uses of a channel 
send $n(C - \delta_n)$ bits with error $\epsilon_n$, where $\delta_n, \epsilon_n \to 0$ as $n \to \infty$. Since the state of n 
quantum systems generally requires $\exp(O(n))$ bits to describe, the set of possible communication 
strategies grows quite rapidly as the number of channel uses increases. 

While early work (such as [Hol73, BB84]) focused on using quantum channels to transmit 
classical messages, the last ten years have seen a good deal of work on the task of sending quantum 
information for its own sake, or as part of a quantum computer. The main contribution of the 
first half of this thesis is to show that many tasks previously thought of in hybrid 
classical-quantum terms (such as using entanglement to help a noisy quantum channel send classical 
bits) are better thought of as purely quantum communication tasks. We will introduce a new 
tool, called coherent classical communication, to systematize this intuition. Coherent classical 
communication is actually a purely quantum communication resource; the name indicates that 
it is obtained by modifying protocols that use classical communication so that they preserve 
quantum coherence between different messages. We will find that coherent classical communi- 
cation, together with a rigorous theory of quantum information resources, will give quick proofs 
of a wide array of optimal quantum communication protocols, including several that have not 
been seen before. 

Quantum complexity theory asks how long it takes quantum computers to solve various prob- 
lems. Since quantum algorithms include classical algorithms as a special case, the interesting 
question is when quantum algorithms can perform a task faster than the best possible or best 
known classical algorithm. The ultimate goal here is generally to solve classical problems 
(factoring, etc.) and the question is the amount of classical or quantum resources required to do 
so. 

When considering instead "purely quantum" algorithms, with quantum inputs and quantum 
outputs, it is not immediately apparent what application these algorithms have. However, at 
the heart of Shor's factoring algorithm, and indeed almost all of the other known or suspected 
exponential speedups, is the quantum Fourier transform, a procedure that maps quantum input 
$\sum_x f(x)|x\rangle$ to quantum output $\sum_x \hat{f}(x)|x\rangle$, where $\hat{f}$ is the Fourier transform of the function $f$. 
Such a procedure, which Fourier transforms the amplitudes of a wavefunction rather than an 
array of floating point numbers, would not even make sense on a classical computer; complex 
probabilities do not exist and global properties of a probability distribution (such as periodicity) 
cannot be accessed by a single sample. Likewise, Grover's search algorithm can be thought of 



*This term is due to John Smolin. 
as an application of quantum walks [Sze04], a versatile quantum subroutine that is not only 
faster than classical random walks, but again performs a task that would not be well-defined 
in terms of classical probabilities. These quantum subroutines represent the core of quantum 
speedups, as well as the place where our classical intuition about algorithms as logical procedures 
breaks down. Thus, finding new nontrivial purely quantum algorithms is likely to be the key 
to understanding exactly how quantum computing is more powerful than the classical model. 

The second half of this thesis is based on the Schur transform, a purely quantum algorithm 
which, like the quantum Fourier transform, changes from a local tensor power basis to a basis 
that reflects the global properties of the system. While the Fourier transform involves the cyclic 
group (which acts on an n-bit number by addition), the Schur transform is instead based on the 
symmetric and unitary groups, which act on n d-dimensional quantum systems by permuting 
them and by collectively rotating them. The primary contribution of this thesis will be an 
efficient quantum circuit implementing the Schur transform. As a purely quantum algorithm, the 
Schur transform does not directly solve any classical problem. However, it is a crucial 
subroutine for many tasks in quantum information theory, which can now be efficiently implemented 
on a quantum computer using our methods. More intriguingly, an efficient implementation of 
the Schur transform raises the hope of finding new types of quantum speedups. 
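
For the smallest nontrivial case, two qubits (n = 2, d = 2), the Schur basis is the familiar triplet/singlet basis, and its defining properties can be checked by hand. The sketch below is only an illustration of that special case, with hypothetical helper names; it is not the circuit construction developed in this thesis.

```python
import math

s = 1 / math.sqrt(2)
# Rows are Schur-basis vectors written in the computational basis |00>, |01>, |10>, |11>.
SCHUR = [
    [1, 0, 0, 0],    # triplet: |00>
    [0, s, s, 0],    # triplet: (|01> + |10>)/sqrt(2)
    [0, 0, 0, 1],    # triplet: |11>
    [0, s, -s, 0],   # singlet: (|01> - |10>)/sqrt(2)
]

def swap(v):
    """Permute the two qubits: the amplitudes of |01> and |10> trade places."""
    return [v[0], v[2], v[1], v[3]]

def dot(u, v):
    """Inner product of real vectors (no conjugation needed here)."""
    return sum(a * b for a, b in zip(u, v))

def kron_apply(U, v):
    """Apply the collective rotation U tensor U to a two-qubit vector v."""
    out = [0.0] * 4
    for i in range(2):
        for j in range(2):
            for k in range(2):
                for l in range(2):
                    out[2 * i + j] += U[i][k] * U[j][l] * v[2 * k + l]
    return out

# The swap acts as +1 on the triplet vectors and -1 on the singlet.
swap_eigenvalues = [round(dot(row, swap(row))) for row in SCHUR]
# The rows are orthonormal, so the change of basis is unitary.
gram = [[round(dot(a, b), 12) for b in SCHUR] for a in SCHUR]
# A collective rotation (illustrative angle) leaves the singlet unchanged.
theta = 0.3
R = [[math.cos(theta), -math.sin(theta)], [math.sin(theta), math.cos(theta)]]
rotated_singlet = kron_apply(R, SCHUR[3])

print(swap_eigenvalues)
print(rotated_singlet)
```

In this basis the permutation and the collective rotation each act within the symmetric (triplet) and antisymmetric (singlet) subspaces, which is the two-qubit instance of the structure the Schur transform exposes for general n and d.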

This section has tried to give a flavor of why quantum information is an interesting subject, 
and of the sort of problems that this thesis contributes to. In the next section, we will set out the 
contributions of this thesis more precisely with a detailed technical summary. 

0.2 Summary of results 

This thesis is divided into two halves: Chapters 1-4 discuss information theory and Chapters 5-8 are 
on the Schur transform. The first chapter of each half is mostly background and the other chapters 
are mostly new work, though some exceptions to this rule will be indicated. A diagram of how the 
chapters depend on one another is given in Fig. 0-1. 
Figure 0-1: Dependencies between different chapters of this thesis. The solid lines indicate that 
one chapter depends on another, while the dashed lines mean a partial dependence: Section 6.3 has 
references to some of the protocols in Section 1.4 and Chapter 3 is motivated by and extends the 
results of Chapter 2. 



Chapter 1 introduces a rigorous framework for concisely stating coding theorems in quantum 
Shannon theory. The key idea, which has long been tacitly understood but not spelled out explicitly, 
is that communication protocols in quantum information theory can be thought of as 
inequalities between asymptotic information processing resources. Channel coding, for example, says 
that a noisy channel is at least as useful for communication as the use of a noiseless channel at 
a particular rate. This chapter rigorously defines and proves the sort of claims we would like to 
take for granted (e.g., that resource inequalities are transitive) in Section 1.2, goes on to prove 
some more advanced properties of resource inequalities in Section 1.3 and then summarizes 
many of the key results of quantum Shannon theory in terms of this new formalism in 
Section 1.4. Chapter 1 also lays out various definitions and notation used in the rest of the thesis, 
and in particular gives a detailed description of how the various purifications we use make up 
the Church of the Larger Hilbert Space (in Section 1.1). This chapter, as well as Chapter 4, is 
based on joint work with Igor Devetak and Andreas Winter, which is in the process of being 
turned into a paper [DHW05]. 

Chapter 2 applies this resource formalism to the problem of communication using a unitary gate 
that couples two parties. Unitary gates are in some ways more complicated than one-way 
quantum channels because they are intrinsically bidirectional, but in other ways they are simpler 
because they do not interact with the environment. The main results of this chapter are 
capacity formulae for entanglement creation and one-way classical communication using unlimited 
entanglement, as well as several relations among these and other capacities. We will see that 
most of these results are superseded by those in the next chapter; the capacity formulae will 
be simultaneously generalized while the relations between capacities will be explained in terms 
of a deeper principle. However, this chapter helps provide motivation, as well as basic tools, 
for the results that follow. It is based on [BHLS03] (joint work with Charles Bennett, Debbie 
Leung and John Smolin), though the original manuscript has been rewritten in order to use the 
resource formalism of Chapter 1 (which has greatly simplified both definitions and proofs) and 
to add new material. 

Chapter 3 introduces the concept of coherent classical communication, a new communication prim- 
itive that can be thought of either as classical communication sent through a unitary channel, 
or as classical communication in which the sender gets the part of the output that normally 
would go to the environment. This provides an efficient (and in fact, usually optimal) link from 
a wide variety of classical-quantum protocols (teleportation, super-dense coding, remote state 
preparation, HSW coding, classical capacities of unitary gates, and more in the next chapter) 
to purely quantum protocols that often would be much more difficult to prove by other means 
(super-dense coding of quantum states, quantum capacities of unitary gates, etc). 

This chapter describes some of the general properties of coherent communication, showing how 
it is equivalent to standard resources and proving conditions under which classical-quantum 
protocols can be made coherent. After describing how the examples in the last paragraph can 
all be fruitfully made coherent, we apply these results to find the tradeoff between the rates 
of classical communication and entanglement generation/consumption possible per use of a 
unitary gate. 

Most of the material in this chapter is based on [Har04], with a few important exceptions. The 
careful proofs of the converse of Theorem 3.7 (which showed that unlimited back 
communication does not improve unitary gate capacities for forward communication or entanglement 
generation) and of coherent remote state preparation are new to the thesis. The full bidirec- 
tional version of Theorem 3.1 (showing that sending classical communication through unitary 
channels is as strong as coherent classical communication) and the discussion of bidirectional 
rate regions in Section 3.4.3 are both from [HL05], which was joint work with Debbie Leung. 
Finally, the formal rules for when classical communication can be made coherent were sketched 
in [DHW04] and will appear in the present form in [DHW05], both of which are joint work with 
Igor Devetak and Andreas Winter. 
Chapter 4 uses coherent classical communication from Chapter 3, the resource formalism from 
Chapter 1 and a few other tools from quantum Shannon theory (mostly derandomization and 
measurement compression) to (1) derive three new communication protocols, (2) unify them 
with four old protocols into a family of related resource inequalities and (3) prove converses 
that yield six different optimal tradeoff curves for communication protocols that use a noisy 
channel or state to produce/consume two noiseless resources, such as classical communication, 
entanglement or quantum communication. 

At the top of the family are two purely quantum protocols that can be related by exchanging 
states with channels: the "mother" protocol for obtaining pure entanglement from a noisy 
state assisted by a perfect quantum channel, and the "father" protocol for sending quantum 
information through a noisy channel assisted by entanglement. Combining the parent protocols 
with teleportation, super-dense coding and entanglement distribution immediately yields all of 
the other "child" protocols in the family. The parents can in turn be obtained from most of 
the children by simple application of coherent classical communication. It turns out that all of 
the protocols in the family are optimal, but since they involve finite amounts of two noiseless 
resources the converses take the form of two-dimensional capacity regions whose border is a 
tradeoff curve. 

This chapter is based on joint work with Igor Devetak and Andreas Winter [DHW04, DHW05]. 
Most of the results first appeared in [DHW04], though proofs of the converses and more careful 
derivations of the parent protocols will be in [DHW05]. 

Chapter 5 begins the part of the thesis devoted to the Schur transform. Schur duality is a way of 
relating the representations that appear when the unitary group $\mathcal{U}_d$ and the symmetric group 
$S_n$ act on $(\mathbb{C}^d)^{\otimes n}$. Schur duality implies the existence of a Schur basis which simultaneously 
diagonalizes these representations; and the unitary matrix relating the Schur basis to the 
computational basis is known as the Schur transform. 

The chapter begins by describing general properties of group representations, such as how 
they combine in the Clebsch-Gordan transform and how the Fourier transform decomposes the 
regular representation, using the language of quantum information. Then we go on to describe 
the Schur transform, explain how it can be used to understand the irreps of $\mathcal{U}_d$ and $S_n$, and 
give an idea of how Schur duality can be generalized to other groups. 

None of the material in this chapter is new (see [GW98] for a standard reference), but a 
presentation of this form has not appeared before in the quantum information literature. A 
small amount of the material has appeared in [BCH04] and most will later appear in [BCH06a, 
BCH06b], all of which are joint work with Dave Bacon and Isaac Chuang. 

Chapter 6 describes how Schur duality can be applied to quantum information theory in a way 
analogous to the use of the method of types in classical information theory. It begins by 
reviewing the classical method of types in Section 6.1 (following standard texts [CT91, CK81]) 
and then collects a number of facts that justify the use of Schur duality as a quantum method 
of types in Section 6.2 (following [GW98, Hay02a, CM04]). Section 6.3 then surveys a wide 
variety of information theory results from the literature that are based on Schur duality. This 
section will appear in [BCH06a] and a preliminary version was in [BCH04] (both joint with 
Dave Bacon and Isaac Chuang). 

The only new results of the chapter are in Section 6.4, which gives a way to decompose n uses 
of a memoryless quantum channel in the Schur basis, and shows how the components of the 
decomposition can be thought of as quantum analogues of joint types. 

Chapter 7 turns to the question of computational efficiency and gives a $\mathrm{poly}(n, d, \log 1/\epsilon)$ algorithm 
that approximates the Schur transform on $(\mathbb{C}^d)^{\otimes n}$ up to accuracy $\epsilon$. 

The main idea is a reduction from the Schur transform to the Clebsch-Gordan transform, which
is described in Section 7.2. Then an efficient circuit for the Clebsch-Gordan transform is given
in Section 7.3. Both of these algorithms are made possible by using subgroup-adapted bases
which are discussed in Section 7.1.

Section 7.2 first appeared in [BCH04] and the rest of the chapter will soon appear in [BCH06a].
Again, all of this work was done together with Dave Bacon and Isaac Chuang.

Chapter 8 explores algorithmic connections between the Schur transform and the quantum Fourier
transform (QFT) over $\mathcal{S}_n$. We begin by presenting generalized phase estimation, in which the
QFT is used to measure a state in the Schur basis, and then discuss some generalizations and
interpretations of the algorithm. Then we give a reduction in the other direction, and show how
a variant of the standard $\mathcal{S}_n$ QFT can be derived from one application of the Schur transform.

Generalized phase estimation was introduced in the earlier versions of [BCH04], and will appear
along with the other results in this chapter in [BCH06b] (joint with Dave Bacon and Isaac
Chuang).

Recommended background: This thesis is meant to be understandable to anyone familiar with
the basics of quantum computing and quantum information theory. The textbook by Nielsen and
Chuang [NC00] is a good place to start; Chapter 2 (or knowledge of quantum mechanics) is essential
for understanding this thesis, Chapters 9 and 11 (or knowledge of the HSW theorem and related
concepts) are necessary for the first half of the thesis, and Sections 4.1-4.4, 5.1-5.2 and 12.1-12.5 are
recommended. The first six chapters of Preskill's lecture notes [Pre98] are another option. Both [NC00]
and [Pre98] should be accessible to anyone familiar with the basics of probability and linear algebra.
Further pointers to the literature are contained in Chapters 1 and 5, which respectively introduce
the information theory background used in the first half of the thesis and the representation theory
background used in the second half.



Chapter 1 

Quantum Shannon theory 



Two communicating parties, a sender (henceforth called Alice) and a receiver (Bob), usually have, in
a mathematical theory of communication, a predefined goal like the perfect transmission of a classical
message, but at their disposal are only imperfect resources* like a noisy channel. This is Shannon's
channel coding problem [Sha48]: allowing the parties arbitrary local operations (one could also say
giving them local resources for free) they can perform encoding and decoding of the message to
effectively reduce the noise of the given channel. Their performance is measured by two parameters:
the error probability and the number of bits in the message, and quite naturally they want to minimize
the former while maximizing the latter.

In Shannon theory, we are particularly interested in the case that the channel is actually a number
of independent realizations of the same noisy channel and that the message is long: the efficiency
of a code is then measured by the rate, i.e., the ratio of the number of bits in a message to the number of
channel uses. And in particular again, we ask for the asymptotic regime of arbitrarily long messages
and vanishing error probability.

Note that not only their given channel, but also the goal of the parties, noiseless communication,
is a resource: the channel which transmits one bit perfectly (it is "noisy" in the extreme sense of
zero noise), for which we reserve the special symbol $[c \to c]$ and call simply a cbit. Thus coding
can be described more generally as the conversion of one resource into another, i.e., simulation of
the target resource by using the given resource together with local processing. For a generic noisy
channel, denoted $\{c \to c\}$, we express such an asymptotically faithful conversion of rate $R$ as a
resource inequality

\[ \{c \to c\} \geq R\,[c \to c], \]

which we would like to think of as a sort of chemical reaction, and hence address the left hand side
as reactant resource(s) and the right hand side as product resource(s) with $R$ the conversion ratio
between these two resources. In the asymptotic setting, $R$ can be any real number, and the maximum
$R$ is the (operational) capacity of the channel; to be precise, its capacity to transmit information in the absence
of other resources.

Obviously, there exist other useful or desirable resources, such as perfect correlation in the form 
of a uniformly random bit (abbreviated rbit) known to both parties, denoted $[cc]$, or more generally
some noisy correlation. In quantum information theory, we have further resources: noisy quantum
channels and quantum correlations between the parties. Again of particular interest are the noiseless
unit resources; $[q \to q]$ is an ideal quantum bit channel (qubit for short), and $[qq]$ is a unit of maximal
entanglement, a two-qubit singlet state (ebit). The study of asymptotic conversion rates between the 
larger class of quantum information-theoretic resources is known as quantum Shannon theory and is 
the main focus of this half of the thesis. 

*The term is used here in an everyday sense; later in this chapter we make it mathematically precise.

To illustrate the goals of quantum Shannon theory, it is instructive to look at the conversions
permitted by the unit resources $[c \to c]$, $[q \to q]$ and $[qq]$, where resource inequalities are finite and
exact: the following inequalities always refer to a specific integral number of available resources of a
given type, and the protocol introduces no error. We mark such inequalities by a $*$ above the $\geq$ sign.

For example, it is always possible to use a qubit to send one classical bit, $[q \to q] \stackrel{*}{\geq} [c \to c]$, and to
distribute one ebit, $[q \to q] \stackrel{*}{\geq} [qq]$; the latter is referred to as entanglement distribution (ED).

More inequalities are obtained by combining resources. Super-dense coding [BW92] is a coding
protocol to send two classical bits using one qubit and one ebit:

\[ [q \to q] + [qq] \stackrel{*}{\geq} 2[c \to c]. \qquad \text{(SD)} \]

Teleportation [BBC+93] is expressed as

\[ 2[c \to c] + [qq] \stackrel{*}{\geq} [q \to q]. \qquad \text{(TP)} \]

In [BBC+93] the following argument was used to show that the ratio of $1:2$ between $[q \to q]$ and $[c \to c]$ in
these protocols is optimal, even with unlimited entanglement, and even asymptotically: assume, with
$R > 1$, $[q \to q] + \infty[qq] \geq 2R[c \to c]$; then chaining this with (TP) gives $[q \to q] + \infty[qq] \geq R[q \to q]$.
Hence by iteration $[q \to q] + \infty[qq] \geq R^k[q \to q] \geq R^k[c \to c]$ for arbitrary $k$, which can make
$R^k$ arbitrarily large, and this is easily disproved. Analogously, $2[c \to c] + \infty[qq] \geq R[q \to q]$, with
$R > 1$, gives, when chained with (SD), $2[c \to c] + \infty[qq] \geq 2R[c \to c]$, which also easily leads to a
contradiction. In a similar way, the optimality of the one ebit involved in both (SD) and (TP) can
be seen.
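
As a rough numerical illustration of the iteration argument (a sketch that is not part of the original text), suppose a hypothetical protocol achieved a ratio $R > 1$. Chaining it with (TP) $k$ times would then give $R^k$ cbits per qubit, which eventually exceeds any finite bound. The function names and the sample bound of 2 cbits per qubit are assumptions chosen only for illustration.

```python
from fractions import Fraction

def cbits_after_rounds(R, k):
    """cbits per qubit after chaining the hypothetical protocol k times."""
    return Fraction(R) ** k

def rounds_to_exceed(R, bound):
    """Smallest k with R**k > bound, for a hypothetical ratio R > 1."""
    R = Fraction(R)
    if R <= 1:
        raise ValueError("the argument needs R > 1")
    k, rate = 0, Fraction(1)
    while rate <= bound:
        rate *= R
        k += 1
    return k

# Hypothetical ratio R = 11/10 and an illustrative finite bound of 2 cbits per qubit.
k = rounds_to_exceed(Fraction(11, 10), 2)
print(k, float(cbits_after_rounds(Fraction(11, 10), k)))
```

Exact rational arithmetic is used so that the comparison with the bound is not affected by rounding; any other finite bound would be exceeded in the same way after enough rounds.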

While the above demonstration looks as if we did nothing but introduce a fancy notation for things
understood perfectly well otherwise, in this chapter we want to make the case for a systematic theory
of resource inequalities. We will present a framework general enough to include most two-player
setups, specifically designed for the asymptotic memoryless regime. There are three main issues
there: first, a suitably flexible definition of a protocol, i.e., a way of combining resources (and with it
a mathematically precise notion of a resource inequality); second, a justification of the composition
(chaining) of resource inequalities; and third, general tools to produce new protocols (and hence
resource inequalities) from existing ones.

The benefit of such a theory should be clear then: while it does not mean that we get coding
theorems "for free", we do get many protocols by canonical modifications from others, which saves
effort and provides structural insights into the logical dependencies among coding theorems. As
the above example shows, we also can relate (and sometimes actually prove) the converses, i.e. the
statements of optimality, using the resource calculus.

The remainder of this chapter will systematically develop the resource formalism of quantum Shannon 
theory, and will show how it can concisely express and relate many familiar results in quantum 
information. 

Section 1.1 covers the preliminaries and describes several complementary formalisms for quantum
mechanics, which serve diverse purposes in the study of quantum information processing. Here
also some basic facts are collected.

Section 1.2 sets up the basic communication scenario we will be interested in. It contains definitions
and basic properties of so-called finite resources, and how they can be used in protocols. Building
upon these we define asymptotic resources and inequalities between them, in such a way as to
ensure natural composability properties.

Section 1.3 contains a number of general and useful resource inequalities.

Section 1.4 compiles most of the hitherto discovered coding theorems, rewritten as resource
inequalities.

Section 1.5 concludes with a discussion of possible extensions to the resource formalism developed
in the rest of the chapter.

The following three chapters will apply this formalism to develop new coding results.

Chapter 2 will examine the communication and entanglement-generating capacities of bipartite
unitary gates. It is primarily based on [BHLS03] (joint work with Charles Bennett, Debbie
Leung and John Smolin).

Chapter 3 develops the idea of coherent classical communication and applies it to a variety of topics
in quantum Shannon theory, following the treatment of [Har04] and [HL05] (joint work with
Debbie Leung).

Chapter 4 shows how coherent classical communication can be used to derive new quantum protocols
and unify old ones into a family of resource inequalities. This chapter, as well as the present
one, is based on [DHW04, DHW05] (joint work with Igor Devetak and Andreas Winter).

1.1 Preliminaries 

This section is intended to introduce notation and ways of speaking about quantum mechanical 
information scenarios. We also state several key lemmas needed for the technical proofs. Most of 
the facts and the spirit of this section can be found in [Hol01]; a presentation slightly more on the
algebraic side is [Win99b], appendix A. 

1.1.1 Variations on formalism of quantum mechanics 

We start by reviewing several equivalent formulations of quantum mechanics and discussing their 
relevance for the study of quantum information processing. As we shall be using several of them 
in different contexts, it is useful to present them in a systematic way. The main two observations 
are, first, that a classical random variable can be identified with a quantum system equipped with
a preferred basis, and second, that a quantum Hilbert space can always be extended to render all 
states pure (via a reference system) and all operations unitary (via an environment system) on the 
larger Hilbert space. 

Both have been part of the quantum information processing folklore for at least a decade (the
second of course goes back much farther: the GNS construction, Naimark's and Stinespring's theorems,
see [Hol01]), and roughly correspond to the "Church of the Larger Hilbert Space" viewpoint.

Based on this hierarchy of embeddings C(lassical) $\Rightarrow$ Q(uantum) $\Rightarrow$ P(ure), in the above sense,
we shall see how the basic "CQ" formalism of quantum mechanics gets modified to (embedded into)
CP, QQ, QP, PQ and PP formalisms. (The second letter refers to the way quantum information
is presented; the first, how knowledge about this information is presented.) We stress that from
an operational perspective they are all equivalent; however, which formalism is the most useful
depends on the context.

Throughout the thesis we shall use labels such as $A$ (similarly, $B$, $C$, etc.) to denote not only
a particular quantum system but also the corresponding Hilbert space (which is also denoted $\mathcal{H}_A$)
and to some degree even the set of bounded linear operators on that Hilbert space (also denoted
$\mathcal{L}(\mathcal{H}_A)$ or $\mathcal{L}(A)$). If $|\psi\rangle$ is a pure state, then we will sometimes use $\psi$ to denote the density matrix
$|\psi\rangle\langle\psi|$. When talking about tensor products of spaces, we will habitually omit the tensor sign, so
$A \otimes B = AB$, etc. Labels such as $X$, $Y$, etc. will be used for classical random variables. For simplicity,
all spaces and ranges of variables will be assumed to be finite.
The CQ formalism. This formalism is the most commonly used one in the literature, as it captures
most of the operational features of a "Copenhagen" type quantum mechanics, which concerns itself
more with the behavior of quantum systems than their meaning, reserving ontological statements
about probabilities, measurement outcomes, etc. for classical systems. The postulates of quantum
mechanics can be classified into static and dynamic ones. The static postulates define the static
entities of the theory, while the dynamic postulates describe the physically allowed evolution of the
static entities. In defining classes of static and dynamic entities, we will try to highlight their
(quantum) information-theoretic significance.

The most general static entity is an ensemble of quantum states $(p_x, \rho_x)_{x \in \mathcal{X}}$. The probability
distribution $(p_x)_{x \in \mathcal{X}}$ is defined on some set $\mathcal{X}$ and is associated with the random variable $X$. The $\rho_x$
are density operators (positive Hermitian operators of unit trace) on the Hilbert space of a quantum
system $A$. The state of the quantum system $A$ is thus correlated with the classical index random
variable $X$. We refer to $XA$ as a hybrid classical-quantum system, and the ensemble $(p_x, \rho_x)_{x \in \mathcal{X}}$
is the "state" of $XA$. We will occasionally refer to a classical-quantum system as a "$\{cq\}$ entity".
Special cases of $\{cq\}$ entities are $\{c\}$ entities ("classical systems", i.e. random variables) and $\{q\}$
entities (quantum systems).

The most general dynamic entity would be a map between two $\{cq\}$ entities (hence, and throughout
the thesis, we describe dynamics in the Schrödinger picture). Let us highlight only a few special
cases:

The most general map from a $\{c\}$ entity to a $\{q\}$ entity is a state preparation map or a "$\{c \to q\}$
entity". It is defined by a quantum alphabet $(\rho_x)_{x \in \mathcal{X}}$ and maps the classical index $x$ to the quantum
state $\rho_x$.

Next we have a $\{q \to c\}$ entity, a quantum measurement, defined by a positive operator-valued
measure (POVM) $(M_x)_{x \in \mathcal{X}}$, where the $M_x$ are positive operators satisfying $\sum_x M_x = \mathbb{1}$, with $\mathbb{1}$ the identity
operator on the underlying Hilbert space. The action of the POVM $(M_x)_{x \in \mathcal{X}}$ on some quantum
system $\rho$ results in the random variable defined by the probability distribution $(\mathrm{Tr}\,\rho M_x)_{x \in \mathcal{X}}$ on $\mathcal{X}$.
POVMs will be denoted with roman capitals: $L$, $M$, $N$, $P$, etc.

A $\{q \to q\}$ entity is a quantum operation, a completely positive and trace preserving (CPTP)
map $\mathcal{N}: A \to B$, described (non-uniquely) by its Kraus representation: a set of operators $\{N_x\}_{x \in \mathcal{X}}$,
$\sum_x N_x^\dagger N_x = \mathbb{1}^B$, whose action is given by

\[ \mathcal{N}(\rho) = \sum_x N_x \rho N_x^\dagger. \]

(When referring to operators, we use $\dagger$ for the adjoint, while $*$ is reserved for the complex conjugate.
In Chapters 5-7, we will also apply $*$ to representation spaces to indicate the dual representation.)
A CP map is defined as above, but with the weaker restriction $\sum_x A_x^\dagger A_x \leq \mathbb{1}^B$, and by itself is
unphysical (or rather, it includes a postselection of the system). Throughout, we will denote CP and
CPTP maps by calligraphic letters: $\mathcal{L}$, $\mathcal{M}$, $\mathcal{N}$, $\mathcal{P}$, etc. A special CPTP map is the identity on a
system $A$, $\mathrm{id}^A: A \to A$, with $\mathrm{id}^A(\rho) = \rho$. More generally, for an isometry $U: A \to B$, we denote
(for once deviating from the notation scheme outlined here) the corresponding CPTP map by the
same letter: $U(\rho) = U \rho U^\dagger$.
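
To make the Kraus representation concrete, the following sketch (not from the original text) applies a qubit amplitude damping channel, chosen here purely as an example, to a state and checks the completeness condition on its Kraus operators. The helper names and the damping parameter `gamma` are illustrative assumptions; matrices are stored as plain nested lists.

```python
from math import sqrt

def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dagger(A):
    """Conjugate transpose."""
    return [[complex(A[i][j]).conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

def madd(A, B):
    """Entrywise sum."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def zeros(n):
    """n x n zero matrix."""
    return [[0j] * n for _ in range(n)]

def apply_channel(kraus, rho):
    """N(rho) = sum_x N_x rho N_x^dagger."""
    out = zeros(len(kraus[0]))
    for N in kraus:
        out = madd(out, matmul(matmul(N, rho), dagger(N)))
    return out

def is_trace_preserving(kraus, tol=1e-12):
    """Check that sum_x N_x^dagger N_x is the identity on the input space."""
    d = len(kraus[0][0])
    total = zeros(d)
    for N in kraus:
        total = madd(total, matmul(dagger(N), N))
    return all(abs(total[i][j] - (1 if i == j else 0)) < tol
               for i in range(d) for j in range(d))

gamma = 0.25  # hypothetical damping probability
kraus = [[[1, 0], [0, sqrt(1 - gamma)]],
         [[0, sqrt(gamma)], [0, 0]]]
rho = [[0, 0], [0, 1]]  # the pure state |1><1|
out = apply_channel(kraus, rho)
print(is_trace_preserving(kraus), out[0][0].real, out[1][1].real)
```

Dropping one of the two Kraus operators would leave a CP map that is no longer trace preserving, which corresponds to the postselected case mentioned above.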

A $\{q \to cq\}$ entity is an instrument $\mathbb{P}$, described by an ordered set of CP maps $(\mathcal{P}_x)_x$ that add up
to a CPTP map. $\mathbb{P}$ maps a quantum state $\rho$ to the ensemble $(p_x, \mathcal{P}_x(\rho)/p_x)_x$, with $p_x = \mathrm{Tr}\,\mathcal{P}_x(\rho)$. A
special case of an instrument is one in which $\mathcal{P}_x = p_x \mathcal{N}_x$, and the $\mathcal{N}_x$ are CPTP; it is equivalent to
an ensemble of CPTP maps, $(p_x, \mathcal{N}_x)_{x \in \mathcal{X}}$. Instruments will be denoted by blackboard style capitals:
$\mathbb{L}$, $\mathbb{M}$, $\mathbb{N}$, $\mathbb{P}$, etc.

A $\{cq \to q\}$ entity is given by an ordered set of CPTP maps $(\mathcal{N}_x)_x$, and maps the ensemble
$(p_x, \rho_x)_{x \in \mathcal{X}}$ to $\sum_x p_x \mathcal{N}_x(\rho_x)$. By contrast, a $\{c, q \to q\}$ map saves the classical label, mapping
$(p_x, \rho_x)_{x \in \mathcal{X}}$ to $(p_x, \mathcal{N}_x(\rho_x))_{x \in \mathcal{X}}$.

In quantum information theory the CQ formalism is used for proving direct coding theorems of
a part classical, part quantum nature, such as the HSW theorem [Hol98, SW97]. In addition, it is
most suitable for computational purposes.

For two states, we write $\rho^{RA} \supseteq \sigma^A$ to mean that the state $\sigma^A$ is a restriction of $\rho^{RA}$, namely
$\sigma^A = \mathrm{Tr}_R\, \rho^{RA}$. The subsystem $R$ is possibly null (which we write $R = \emptyset$), i.e., a 1-dimensional
Hilbert space. Conversely, $\rho^{RA}$ is called an extension of $\sigma^A$. Furthermore, if $\rho^{RA}$ is pure it is called
a purification of $\sigma^A$. The purification is unique up to a local isometry on $R$: this is an elementary
consequence of the Schmidt decomposition (discussed in Section 2.1.2). These notions carry over to
dynamic entities as well. For two quantum operations $\mathcal{A}: A \to BE$ and $\mathcal{B}: A \to B$ we write $\mathcal{A} \supseteq \mathcal{B}$
if $\mathcal{B} = \mathrm{Tr}_E \circ \mathcal{A}$. If $\mathcal{A}$ is an isometry, it is called an isometric extension of $\mathcal{B}$, and is unique up to an
isometry on $E$; this and the existence of such a dilation are known as Stinespring's theorem [Sti55].

Observe that we can safely represent noiseless quantum evolution by isometries between systems
(whereas quantum mechanics demands unitarity): this is because our systems are all finite, and
we can embed the isometries into unitaries on larger systems. Thus we lose no generality but gain
flexibility.

The CP formalism. In order to define the CP formalism, it is necessary to review an alternative
representation of the CQ formalism that involves fewer primitives. For instance,

• $\{q\}$. A quantum state $\rho^A$ is referred to by its purification $|\phi\rangle^{AR}$.

• $\{cq\}$, $\{c \to q\}$. The ensemble $(p_x, \rho_x^A)_x$ [resp. quantum alphabet $(\rho_x^A)_x$] is similarly seen as
restrictions of a pure state ensemble $(p_x, |\phi_x\rangle^{AR})_x$ [resp. quantum alphabet $(|\phi_x\rangle^{AR})_x$].

• $\{q \to q\}$. A CPTP map $\mathcal{N}: A \to B$ is referred to by its isometric extension $U_{\mathcal{N}}: A \to BE$.

• $\{q \to c\}$. A POVM $(M_x)_x$ on the system $A$ is equivalent to some isometry $U_M: A \to A E_X$,
followed by a von Neumann measurement of the system $E_X$ in basis $\{|x\rangle^{E_X}\}$, and discarding
$A$.

• $\{q \to cq\}$. An instrument $\mathbb{P}$ is equivalent to some isometry $U_{\mathbb{P}}: A \to B E E_X$, followed by a von
Neumann measurement of the system $E_X$ in basis $\{|x\rangle^{E_X}\}$, and discarding $E$.

• $\{c, q \to q\}$. The ensemble of CPTP maps $(p_x, \mathcal{N}_x)_x$ is identified with the ensemble of isometric
extensions $(p_x, U_{\mathcal{N}_x})_x$.

In this alternative representation of the CQ formalism all the quantum static entities are thus
seen as restrictions of pure states and all quantum dynamic entities are combinations of performing
isometries, von Neumann measurements, and discarding auxiliary subsystems. The CP formalism is
characterized by never discarding (tracing out) the auxiliary subsystems (reference systems,
environments, ancillas); they are kept in the description of our system. As for the auxiliary subsystems that
get (von-Neumann-) measured, without loss of generality they may be discarded: the leftover state
of such a subsystem may be set to a standard state $|0\rangle$ (and hence decoupled from the rest of the
system) by a local unitary conditional upon the measurement outcome.

The CP formalism is mainly used in quantum information theory for proving direct coding
theorems of a quantum nature, such as the quantum channel coding theorem (see e.g. [Dev05a]).

The QP formalism. The QP formalism differs from CP in that the classical random variables, i.e.
classical systems, are embedded into quantum systems, thus enabling a unified treatment of the two.

• $\{c\}$. The classical random variable $X$ is identified with a dummy quantum system $X$ equipped
with preferred basis $\{|x\rangle^X\}$, in the state $\sigma^X = \sum_x p_x |x\rangle\langle x|^X$. The main difference between
random variables and quantum systems is that random variables exist without reference to a
particular physical implementation, or a particular system "containing" it. In the QP formalism
this is reflected in the fact that the state $\sigma^X$ remains intact under the "copying" operation
$\Delta: X \to XX'$, with Kraus representation $\{|x\rangle^X |x\rangle^{X'} \langle x|^X\}$. In this way, instances of the same
random variable may be contained in different physical systems.

• $\{cq\}$. An ensemble $(p_x, |\phi_x\rangle^{AR})_x$ is represented by a quantum state

\[ \sigma^{XAR} = \sum_x p_x |x\rangle\langle x|^X \otimes \phi_x^{AR}. \]

• $\{c \to q\}$. A state preparation map $(|\phi_x\rangle^{AR})_x$ is given by the isometry $\sum_x |\phi_x\rangle^{AR} |x\rangle^X \langle x|^X$,
followed by tracing out $X$.

• $\{cq \to q\}$. The ensemble of isometries $(p_x, U_x)$ is represented by the controlled isometry

\[ \sum_x |x\rangle\langle x|^X \otimes U_x. \]

• $\{q \to c\}$, $\{q \to cq\}$. POVMs and instruments are treated as in the CP picture, except that the
final von Neumann measurement is replaced by a completely dephasing operation $\overline{\mathrm{id}}: E_X \to X$,
defined by the Kraus representation $\{|x\rangle^X \langle x|^{E_X}\}_x$.
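
The copying and dephasing operations in the list above can be sketched numerically for a single classical bit. This is an illustrative example, not part of the original text, and the helper names are assumptions. It shows that a state diagonal in the preferred basis is left intact by copying (both marginals equal the input), whereas the coherences of a non-classical state are destroyed by dephasing.

```python
def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dagger(A):
    """Conjugate transpose."""
    return [[complex(A[i][j]).conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

def apply_channel(kraus, rho):
    """sum_x K_x rho K_x^dagger for a list of Kraus operators."""
    n = len(kraus[0])
    out = [[0j] * n for _ in range(n)]
    for K in kraus:
        term = matmul(matmul(K, rho), dagger(K))
        out = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(out, term)]
    return out

# Kraus operators |x>|x><x| of the copying operation X -> XX' (d = 2);
# the pair (x, x') is stored at row index 2*x + x'.
copy_kraus = [[[1, 0], [0, 0], [0, 0], [0, 0]],
              [[0, 0], [0, 0], [0, 0], [0, 1]]]
# Kraus operators |x><x| of the completely dephasing operation.
dephase_kraus = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]

def marginals(rho4):
    """Reduced states on X and on X' of a two-bit density matrix."""
    on_x = [[sum(rho4[2 * a + c][2 * b + c] for c in range(2))
             for b in range(2)] for a in range(2)]
    on_xp = [[sum(rho4[2 * c + a][2 * c + b] for c in range(2))
              for b in range(2)] for a in range(2)]
    return on_x, on_xp

sigma = [[0.3, 0], [0, 0.7]]     # a classical (diagonal) state
plus = [[0.5, 0.5], [0.5, 0.5]]  # the non-classical state |+><+|
copied_x, copied_xp = marginals(apply_channel(copy_kraus, sigma))
dephased_plus = apply_channel(dephase_kraus, plus)
print(copied_x[0][0].real, copied_xp[1][1].real, abs(dephased_plus[0][1]))
```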

The QP formalism is mainly used in quantum information theory for proving converse theorems.

Other formalisms. The QQ formalism is obtained from the QP formalism by tracing out the
auxiliary systems, and is also convenient for proving converse theorems. In this formalism the primitives
are general quantum states (static) and quantum operations (dynamic).

The PP formalism involves further "purifying" the classical systems in the QP formalism; it is 
distinguished by its remarkably simple structure: all of quantum information processing is described in 
terms of isometries on pure states. There is also a PQ formalism, for which we don't see much use; one 
may also conceive of hybrid formalisms, such as QQ/QP, in which some but not all auxiliary systems 
are traced out. One should remain flexible. We will usually indicate, however, which formalism we 
are using as we go along. 

1.1.2 Quantities, norms, inequalities, and miscellaneous notation 

For a state $\rho^{RA}$ and quantum operation $\mathcal{N}: A \to B$ we identify, somewhat sloppily,

\[ \mathcal{N}(\rho) := (\mathrm{id}^R \otimes \mathcal{N})\rho^{RA}. \]

With each state $\rho^B$, one may associate a quantum operation that appends the state to the input,
namely $\mathcal{A}^\rho: A \to AB$, defined by

\[ \mathcal{A}^\rho(\sigma^A) = \sigma^A \otimes \rho^B. \]

The state $\rho$ and the operation $\mathcal{A}^\rho$ are clearly equivalent in an operational sense.

Given some state, say $\rho^{XAB}$, one may define the usual entropic quantities with respect to it.
Recall the definition of the von Neumann entropy $H(A) = H(A)_\rho = H(\rho^A) = -\mathrm{Tr}(\rho^A \log \rho^A)$, where
$\rho^A = \mathrm{Tr}_{XB}\, \rho^{XAB}$. When we specialize to binary entropy this becomes $H_2(p) := -p \log p - (1-p) \log(1-p)$.
Throughout this thesis exp and log are base 2. Further define the conditional entropy [CA97]

\[ H(A|B) = H(A|B)_\rho = H(AB) - H(B), \]

the quantum mutual information [CA97]

\[ I(A; B) = I(A; B)_\rho = H(A) + H(B) - H(AB), \]

the coherent information [Sch96, SN96]

\[ I(A\rangle B) = -H(A|B) = H(B) - H(AB), \]

and the conditional mutual information

\[ I(A; B|X) = H(A|X) + H(B|X) - H(AB|X). \]

Note that the conditional mutual information is always non-negative, thanks to strong
subadditivity [LR73].
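
As a small worked example of these definitions (an illustrative sketch, not part of the original text), the quantities can be evaluated from eigenvalue spectra alone. The two test cases are one ebit, whose joint state is pure, and one rbit, whose joint state is an equal mixture of two perfectly correlated outcomes. The function names and dictionary keys are assumptions made for this example.

```python
from math import log2

def entropy(spectrum):
    """Entropy (base 2) of a list of eigenvalues; zero eigenvalues contribute 0."""
    return -sum(p * log2(p) for p in spectrum if p > 0)

def binary_entropy(p):
    """H_2(p) = -p log p - (1 - p) log(1 - p)."""
    return entropy([p, 1 - p])

def entropic_quantities(spec_A, spec_B, spec_AB):
    """Conditional entropy, mutual information and coherent information."""
    H_A, H_B, H_AB = entropy(spec_A), entropy(spec_B), entropy(spec_AB)
    return {"H(A|B)": H_AB - H_B,
            "I(A;B)": H_A + H_B - H_AB,
            "I(A>B)": H_B - H_AB}

# One ebit: pure joint state, maximally mixed marginals.
ebit = entropic_quantities([0.5, 0.5], [0.5, 0.5], [1.0, 0.0, 0.0, 0.0])
# One rbit: classically correlated joint state with the same marginals.
rbit = entropic_quantities([0.5, 0.5], [0.5, 0.5], [0.5, 0.5, 0.0, 0.0])
print(ebit)
print(rbit)
```

The ebit has negative conditional entropy and positive coherent information, while for the rbit both vanish; this is one way to see numerically that the two unit resources differ.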

It should be noted that conditioning on classical variables (systems) amounts to averaging. For
instance, for a state of the form

\[ \sigma^{XA} = \sum_x p_x |x\rangle\langle x|^X \otimes \rho_x^A, \]

\[ H(A|X)_\sigma = \sum_x p_x H(A)_{\rho_x}. \]
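
The averaging property can be checked numerically with a minimal sketch (not part of the original text; the function names and the sample ensemble are assumptions). Because a classical-quantum state is block diagonal in the classical index, its spectrum consists of the products of each probability with the eigenvalues of the corresponding conditional state, so only spectra are needed.

```python
from math import log2

def entropy(spectrum):
    """Entropy (base 2) of a list of eigenvalues."""
    return -sum(p * log2(p) for p in spectrum if p > 0)

def conditional_entropy_cq(probs, spectra):
    """H(A|X) = H(XA) - H(X) for a block-diagonal classical-quantum state."""
    joint = [p * lam for p, s in zip(probs, spectra) for lam in s]
    return entropy(joint) - entropy(probs)

def averaged_entropy(probs, spectra):
    """sum_x p_x H(rho_x)."""
    return sum(p * entropy(s) for p, s in zip(probs, spectra))

probs = [0.25, 0.75]                # hypothetical distribution of X
spectra = [[1.0, 0.0], [0.5, 0.5]]  # a pure state and a maximally mixed qubit
lhs = conditional_entropy_cq(probs, spectra)
rhs = averaged_entropy(probs, spectra)
print(lhs, rhs)
```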

We shall freely make use of standard identities for these entropic quantities, which are formally
identical to the classical case (see Ch. 2 of [CT91] or Ch. 1 of [CK81]). One such identity is the
so-called chain rule for mutual information,

\[ I(A; BC) = I(A; B|C) + I(A; C), \]

and using it we can derive an identity that will later be useful:

\[ I(X; AB) = H(A) + I(A\rangle BX) - I(A; B) + I(X; B). \qquad (1.1) \]
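
One way to obtain Eq. (1.1) is sketched below. This derivation is an editorial addition that uses only the definitions above, applied formally with a quantum conditioning system, together with the chain rule with the systems relabeled.

```latex
\begin{align*}
I(X; AB) &= I(X; A|B) + I(X; B)
  && \text{(chain rule)} \\
&= H(A|B) - H(A|BX) + I(X; B)
  && \text{(since } H(XA|B) - H(X|B) = H(A|BX)\text{)} \\
&= \bigl[ H(A) - I(A; B) \bigr] + I(A\rangle BX) + I(X; B)
  && \text{(since } H(A|B) = H(A) - I(A; B)\text{ and } I(A\rangle BX) = -H(A|BX)\text{)}.
\end{align*}
```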

We shall usually work in situations where the underlying state is unambiguous, but as shown 
above, we can emphasize the state by putting it in subscript. 

We measure the distance between two quantum states $\rho^A$ and $\sigma^A$ by the trace norm,

\[ \|\rho^A - \sigma^A\|_1, \]

where $\|\omega\|_1 = \mathrm{Tr}\sqrt{\omega^\dagger \omega}$; for Hermitian operators this is the sum of absolute values of the eigenvalues.
If $\|\rho^A - \sigma^A\|_1 \leq \epsilon$, then we sometimes write that $\rho \approx_\epsilon \sigma$. An important property of the trace distance
is its monotonicity under quantum operations $\mathcal{N}$:

\[ \|\mathcal{N}(\rho) - \mathcal{N}(\sigma)\|_1 \leq \|\rho - \sigma\|_1. \]

In fact, the trace distance is operationally connected to the distinguishability of the states: if $\rho$
and $\sigma$ have uniform prior, Helstrom's theorem [Hel76] says that the maximum probability of correct
identification of the state by a POVM is $\frac{1}{2} + \frac{1}{4}\|\rho - \sigma\|_1$.
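
For qubits the trace norm and the Helstrom success probability can be computed in closed form. The following is a minimal sketch, not part of the original text, with assumed function names; it uses the fact that a 2x2 Hermitian matrix with diagonal entries $a$, $d$ and off-diagonal entry $b$ has eigenvalues $(a+d)/2 \pm \sqrt{((a-d)/2)^2 + |b|^2}$.

```python
from math import sqrt

def trace_norm_2x2(h):
    """Trace norm of a 2x2 Hermitian matrix: sum of absolute eigenvalues."""
    a = complex(h[0][0]).real
    d = complex(h[1][1]).real
    b = abs(complex(h[0][1]))
    mean = (a + d) / 2
    radius = sqrt(((a - d) / 2) ** 2 + b ** 2)
    return abs(mean + radius) + abs(mean - radius)

def helstrom_success(rho, sigma):
    """1/2 + (1/4) ||rho - sigma||_1 for two equiprobable qubit states."""
    diff = [[rho[i][j] - sigma[i][j] for j in range(2)] for i in range(2)]
    return 0.5 + 0.25 * trace_norm_2x2(diff)

zero = [[1, 0], [0, 0]]          # |0><0|
one = [[0, 0], [0, 1]]           # |1><1|
plus = [[0.5, 0.5], [0.5, 0.5]]  # |+><+|
p_zero_plus = helstrom_success(zero, plus)
print(p_zero_plus, helstrom_success(zero, one), helstrom_success(zero, zero))
```

Orthogonal states are identified with certainty, identical states only by guessing, and the pair $|0\rangle$, $|+\rangle$ falls in between.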

The trace distance induces a metric on density matrices under which the von Neumann entropy
is a continuous function. This fact is known as Fannes' inequality [Fan73, Nie00].

Lemma 1.1 (Fannes). For any states $\rho^A$, $\sigma^A$ defined on a system $A$ of dimension $d$, if $\|\rho^A - \sigma^A\|_1 \leq
\epsilon$ then

\[ |H(A)_\rho - H(A)_\sigma| \leq \epsilon \log d + \eta(\epsilon) \qquad (1.2) \]

where $\eta(\epsilon)$ is defined (somewhat unconventionally) to be $-\epsilon \log \epsilon$ if $\epsilon \leq 1/e$ or $(\log e)/e$ otherwise.
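
A quick numerical sanity check of the bound in Lemma 1.1 is sketched below; it is not part of the original text and the function names are assumptions. It is restricted to states that are diagonal in a common basis, so that the trace norm is simply the sum of absolute differences of the diagonal entries.

```python
from math import e, log2

def entropy(spectrum):
    """Entropy (base 2) of a list of eigenvalues."""
    return -sum(p * log2(p) for p in spectrum if p > 0)

def eta(eps):
    """eta(eps) = -eps log eps if eps <= 1/e, and (log e)/e otherwise."""
    if eps <= 0:
        return 0.0
    return -eps * log2(eps) if eps <= 1 / e else log2(e) / e

def fannes_gap(p, q):
    """Return (|H(p) - H(q)|, Fannes bound) for two commuting diagonal states."""
    d = len(p)
    eps = sum(abs(a - b) for a, b in zip(p, q))  # trace norm of the difference
    return abs(entropy(p) - entropy(q)), eps * log2(d) + eta(eps)

# Compare a pure qubit state with slightly and strongly mixed ones.
results = [fannes_gap([1.0, 0.0], [1 - delta, delta])
           for delta in (0.01, 0.05, 0.25, 0.5)]
for gap, bound in results:
    print(round(gap, 4), round(bound, 4))
```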

Fannes' inequality leads to the following useful corollary:

Lemma 1.2. For the quantity $I(A\rangle B)$ defined on a system $AB$ of total dimension $d$, if $\|\rho^{AB} -
\sigma^{AB}\|_1 \leq \epsilon$ then

\[ |I(A\rangle B)_\rho - I(A\rangle B)_\sigma| \leq \eta'(\epsilon) + K \epsilon \log d, \]

where $\lim_{\epsilon \to 0} \eta'(\epsilon) = 0$ and $K$ is some constant. The same holds for $I(A; B)$ and other entropic
quantities. □
Define a distance measure between two quantum operations $\mathcal{M}, \mathcal{N}: A_1 A_2 \to B$ with respect to
some state $\omega^{A_1}$ by

\[ \|\mathcal{M} - \mathcal{N}\|_{\omega^{A_1}} := \max_{\zeta^{RA_1A_2} \supseteq \omega^{A_1}} \left\| (\mathrm{id}^R \otimes \mathcal{M})\zeta^{RA_1A_2} - (\mathrm{id}^R \otimes \mathcal{N})\zeta^{RA_1A_2} \right\|_1. \qquad (1.3) \]

The maximization may, w.l.o.g., be performed over pure states $\zeta^{RA_1A_2}$. This is due to the
monotonicity of trace distance under the partial trace map. Important extremes are when $A_1$ or $A_2$ are
null. The first case measures absolute closeness between the two operations (and in fact, $\|\cdot\|_\emptyset$ is the
dual of the cb-norm [KW04]), while the second measures how similar they are relative to a particular
input state. Eq. (1.3) is written more succinctly as

\[ \|\mathcal{M} - \mathcal{N}\|_\omega := \max_{\zeta \supseteq \omega} \|(\mathcal{M} - \mathcal{N})\zeta\|_1. \]

We say that $\mathcal{M}$ and $\mathcal{N}$ are $\epsilon$-close with respect to $\omega$ if

\[ \|\mathcal{M} - \mathcal{N}\|_\omega \leq \epsilon. \]

Note that $\|\cdot\|_\omega$ is a norm only if $\omega$ has full rank; otherwise, different operations can be at distance
0. If $\rho$ and $\sigma$ are $\epsilon$-close then so are $\mathcal{A}^\rho$ and $\mathcal{A}^\sigma$ (with respect to $\emptyset$, hence every state).
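Recall the definition of the fidelity of two density operators with respect to each other:

\[ F(\rho, \sigma) = \|\sqrt{\rho}\sqrt{\sigma}\|_1^2 = \left( \mathrm{Tr}\sqrt{\sqrt{\sigma}\rho\sqrt{\sigma}} \right)^2. \]

For two pure states $|\phi\rangle$, $|\psi\rangle$ this amounts to

\[ F(|\phi\rangle\langle\phi|, |\psi\rangle\langle\psi|) = |\langle\phi|\psi\rangle|^2. \]

We shall need the following relation between fidelity and the trace distance [FvdG99]

\[ 1 - \sqrt{F(\rho, \sigma)} \leq \frac{1}{2}\|\rho - \sigma\|_1 \leq \sqrt{1 - F(\rho, \sigma)}, \qquad (1.4) \]

the second inequality becoming an equality for pure states. Uhlmann's theorem [Uhl76, Joz94] states
that, for any fixed purification $|\phi\rangle\langle\phi|$ of $\sigma$,

\[ F(\rho, \sigma) = \max_{|\psi\rangle\langle\psi| \supseteq \rho} F(|\psi\rangle\langle\psi|, |\phi\rangle\langle\phi|). \]

As the fidelity is only defined between two states living on the same space, we are, of course, implicitly
maximizing over extensions $|\psi\rangle\langle\psi|$ that live on the same space as $|\phi\rangle\langle\phi|$.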
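
The relation (1.4) between fidelity and trace distance can be illustrated for two pure qubit states, where the second inequality should hold with equality. This is a minimal sketch that is not part of the original text; the function names and the choice of angle are assumptions.

```python
from math import cos, sin, sqrt, pi

def fidelity_pure(phi, psi):
    """F = |<phi|psi>|^2 for two state vectors."""
    overlap = sum(complex(a).conjugate() * b for a, b in zip(phi, psi))
    return abs(overlap) ** 2

def half_trace_norm_pure_qubits(phi, psi):
    """(1/2) || |phi><phi| - |psi><psi| ||_1 for two qubit state vectors."""
    def proj(v):
        return [[v[i] * complex(v[j]).conjugate() for j in range(2)]
                for i in range(2)]
    P, Q = proj(phi), proj(psi)
    a = (P[0][0] - Q[0][0]).real
    d = (P[1][1] - Q[1][1]).real
    b = abs(P[0][1] - Q[0][1])
    mean = (a + d) / 2
    radius = sqrt(((a - d) / 2) ** 2 + b ** 2)
    return 0.5 * (abs(mean + radius) + abs(mean - radius))

theta = pi / 6  # hypothetical angle between the two states
phi = [1, 0]
psi = [cos(theta), sin(theta)]
F = fidelity_pure(phi, psi)
T = half_trace_norm_pure_qubits(phi, psi)
print(1 - sqrt(F), T, sqrt(1 - F))
```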

Lemma 1.3. If $\|\rho - \sigma\|_1 \leq \epsilon$ and $\sigma' \supseteq \sigma$, then there exists some $\rho' \supseteq \rho$ for which $\|\rho' - \sigma'\|_1 \leq 2\sqrt{\epsilon}$.

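Proof. Fix a purification $|\phi\rangle\langle\phi|^{ABC} \supseteq \sigma'^{AB} \supseteq \sigma^A$. By Uhlmann's theorem, there exists some
$|\psi\rangle\langle\psi|^{ABC} \supseteq \rho^A$ such that

\[ F(|\psi\rangle\langle\psi|, |\phi\rangle\langle\phi|) = F(\rho, \sigma) \geq 1 - \epsilon, \]

using also Eq. (1.4). Define $\rho'^{AB} = \mathrm{Tr}_C |\psi\rangle\langle\psi|^{ABC}$. By the monotonicity of trace distance under the
partial trace map and Eq. (1.4), we have

\[ \|\rho' - \sigma'\|_1 \leq \bigl\| |\psi\rangle\langle\psi| - |\phi\rangle\langle\phi| \bigr\|_1 \leq 2\sqrt{\epsilon}, \]

as advertised. □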
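Lemma 1.4. The following statements hold for density operators $\omega^A$, $\omega'^{AA'}$, $\sigma^A$, $\rho^{A'}$, $\Omega^{A_1}$, and
quantum operations $\mathcal{M}', \mathcal{N}': AA'B \to C$, $\mathcal{M}, \mathcal{N}: AB \to C$, $\mathcal{K}, \mathcal{L}: A'B' \to C'$, and $\mathcal{M}_i, \mathcal{N}_i:
A_i A'_i \to A_{i+1} A'_{i+1}$.

1. If $\omega' \supseteq \omega$ then $\|\mathcal{M}' - \mathcal{N}'\|_{\omega'} \leq \|\mathcal{M}' - \mathcal{N}'\|_\omega$.

2. $\|\mathcal{M} - \mathcal{N}\|_\omega \leq \|\mathcal{M} - \mathcal{N}\|_\sigma + 2\sqrt{\|\omega - \sigma\|_1}$.

3. $\|\mathcal{M} \otimes \mathcal{K} - \mathcal{N} \otimes \mathcal{L}\|_{\omega \otimes \rho} \leq \|\mathcal{M} - \mathcal{N}\|_\omega + \|\mathcal{K} - \mathcal{L}\|_\rho$.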

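4. $\|\mathcal{M}_k \circ \cdots \circ \mathcal{M}_1 - \mathcal{N}_k \circ \cdots \circ \mathcal{N}_1\|_\Omega \leq \sum_{i=1}^k \|\mathcal{M}_i - \mathcal{N}_i\|_{(\mathcal{M}_{i-1} \circ \cdots \circ \mathcal{M}_1)(\Omega)}$.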

Proof. Straightforward. □ 

Finally, if we have systems $A_1, A_2, \ldots, A_n$, we use the shorthand $A^n = A_1 \ldots A_n$. Also, the set
$\{1, \ldots, d\}$ is denoted $[d]$.



1.2 Information processing resources 

In this section, the notion of an information processing resource will be rigorously introduced. Unless
stated otherwise, we shall be using the QQ formalism (and occasionally the QP formalism) in order
to treat classical and quantum entities in a unified way.

1.2.1 The distant labs paradigm 

The communication scenarios we will be interested in involve two or more separated parties. Each party
is allowed to perform arbitrary local operations in his or her lab for free. On the other hand, non-local
operations (a.k.a. channels) are valuable resources. In this thesis, we consider the following parties:

• Alice (A) 

• Bob (B): Typically quantum Shannon theory considers only problems involving communication
from Alice to Bob. This means working with channels from Alice to Bob (i.e. of the form
$\mathcal{N}: A' \to B$) and arbitrary states $\rho^{AB}$ shared by Alice and Bob. However, the next two
chapters will also consider some bidirectional communication problems.

• Eve (E): In the CP and QP formalisms, we purify noisy channels and states by giving a share
to the environment. Thus, we replace $\mathcal{N}: A' \to B$ with the isometry $U_{\mathcal{N}}: A' \to BE$ and
replace $\rho^{AB}$ with $\psi^{ABE}$.* We consider a series of operations equivalent when they differ only
by a unitary rotation of the environment.

• Reference (R): Suppose Alice wants to send an ensemble of states $\{p_i, |\alpha_i\rangle^A\}$ to Bob with
average density matrix $\rho^A = \sum_i p_i \alpha_i^A$. We would like to give a lower bound on the average
fidelity of this transmission in terms only of $\rho$. Such a bound can be accomplished (in the
CP/QP formalisms) by extending $\rho^A$ to a pure state $|\phi\rangle^{AR} \supseteq \rho^A$ and finding the fidelity of
the resulting state with the original state when $A$ is sent through the channel and $R$ is left
untouched [BKN00]. Here the reference system $R$ is introduced to guarantee that transmitting
system $A$ preserves its entanglement with an arbitrary external system. Like the environment,
$R$ is always inaccessible and its properties are not changed by local unitary rotations. Indeed
the only freedom in choosing $|\phi\rangle^{AR}$ is given by a local unitary rotation on $R$.

• Source (S): In most coding problems Alice can choose how she encodes the message, but cannot
choose the message that she wants to communicate to Bob; it can be thought of as externally
given. Taking this a step further, we can identify the source of the message as another
protagonist (S), who begins a communication protocol by telling Alice which message to send to
Bob. Introducing S is useful in cases when the Source does more than simply send a state to
Alice; for example in distributed compression, the Source distributes a bipartite state to Alice
and Bob.

*For our purposes, we can think of Eve as a passive environment, but other work, for example on private
communication [Dev05a, BOM04], treats Eve as an active participant who is trying to maximize her information.
In these settings, we introduce private environments for Alice and Bob, $E_A$ and $E_B$, so that they can perform noisy
operations locally without leaking information to Eve.

To each party corresponds a class of quantum or classical systems which they control or have
access to at different times. The systems corresponding to Alice are labeled by $A$ (for example, $A'$,
$A_1$, $X_A$, etc.), while Bob's systems are labeled by $B$. When two classical systems, such as $X_A$ and
$X_B$, have the same principal label it means that they are instances of the same random variable. In
our example, $X_A$ is Alice's copy and $X_B$ is Bob's copy of the random variable $X$.

We turn to some important examples of quantum states and operations. Let $A$, $B$, $A'$, $X_A$ and
$X_B$ be $d$-dimensional systems with respective distinguished bases $\{|x\rangle^A\}$, etc. The standard
maximally entangled state on $AB$ is given by

$$|\Phi_d\rangle^{AB} = \frac{1}{\sqrt{d}} \sum_{x=1}^{d} |x\rangle^A |x\rangle^B .$$

The decohered, "classical", version of this state is

$$\overline{\Phi}_d^{X_A X_B} = \frac{1}{d} \sum_{x=1}^{d} |x\rangle\langle x|^{X_A} \otimes |x\rangle\langle x|^{X_B},$$

which may be viewed as two maximally correlated random variables taking values on the set $[d] =
\{1, \ldots, d\}$. The local restriction of either of these states is the maximally mixed state $\tau_d := \frac{1}{d} I_d$.
(We write $\tau$ to remind us that it is also known as the tracial state.) Define the identity quantum
operation $\mathrm{id}_d : A' \to B$ by the isometry $\sum_x |x\rangle^B \langle x|^{A'}$. (Note that this requires fixed bases of $A'$
and $B$!) It represents a perfect quantum channel between the systems $A'$ and $B$. Its classical
counterpart is the completely dephasing channel $\overline{\mathrm{id}}_d : X_{A'} \to X_B$, given in the Kraus representation
by $\{|x\rangle^{X_B} \langle x|^{X_{A'}}\}_{x \in [d]}$. It corresponds to a perfect classical channel in the sense that it perfectly
transmits random variables, as represented by density operators diagonal in the preferred basis. The
channel $\overline{\Delta}_d : X_{A'} \to X_A X_B$ with Kraus representation $\{|x\rangle^{X_B} |x\rangle^{X_A} \langle x|^{X_{A'}}\}_{x \in [d]}$ is a variation on
$\overline{\mathrm{id}}_d$ in which Alice first makes a (classical) copy of the data before sending it through the classical
channel. The two channels are essentially interchangeable. In Chapter 3 we will discuss the so-called
coherent channel $\Delta_d : A' \to AB$, given by the isometry $\sum_x |x\rangle^A |x\rangle^B \langle x|^{A'}$, which is a coherent version
of the noiseless classical channel with feedback, $\overline{\Delta}_d$. Here and in the following, "coherent" is meant
to say that the operation preserves coherent quantum superpositions.
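As a small numerical illustration of the states just introduced, the following sketch builds $|\Phi_d\rangle$ for $d = 3$, dephases it in the distinguished product basis, and computes the local restrictions. The helper names (`maximally_entangled`, `dephase`, `reduce_to_A`) are illustrative choices rather than notation from the text, and dephasing is applied to both systems at once for simplicity.

```python
from math import sqrt, isclose

def maximally_entangled(d):
    """Amplitudes of |Phi_d> = d^{-1/2} sum_x |x>|x>, stored at index x*d + y."""
    psi = [0.0] * (d * d)
    for x in range(d):
        psi[x * d + x] = 1.0 / sqrt(d)
    return psi

def density(psi):
    """Density matrix |psi><psi| of a real amplitude vector."""
    return [[a * b for b in psi] for a in psi]

def dephase(rho):
    """Completely dephase in the product basis: keep only the diagonal entries."""
    n = len(rho)
    return [[rho[i][j] if i == j else 0.0 for j in range(n)] for i in range(n)]

def reduce_to_A(rho, d):
    """Partial trace over the second system of a (d*d) x (d*d) matrix."""
    return [[sum(rho[x * d + y][xp * d + y] for y in range(d))
             for xp in range(d)] for x in range(d)]

d = 3
phi = density(maximally_entangled(d))   # the pure state Phi_d
phi_bar = dephase(phi)                  # its decohered, classical version
tau_from_phi = reduce_to_A(phi, d)      # local restriction of Phi_d
tau_from_phi_bar = reduce_to_A(phi_bar, d)
```

Both reduced matrices come out as the maximally mixed state $\tau_d$, while `phi_bar` retains only the $d$ diagonal entries $|xx\rangle\langle xx|$, each with weight $1/d$.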

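The maximally entangled state $|\Phi_d\rangle^{AB}$ and perfect quantum channel $\mathrm{id}_d : A' \to B$ are locally
basis covariant: $(U \otimes U^*)|\Phi_d\rangle^{AB} = |\Phi_d\rangle^{AB}$ and $U^\dagger \circ \mathrm{id}_d \circ U = \mathrm{id}_d$ for any unitary $U$. On the other
hand, $\overline{\Phi}_d$, $\overline{\mathrm{id}}_d$, $\overline{\Delta}_d$ and $\Delta_d$ are all locally basis-dependent.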
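The covariance property of the maximally entangled state can be checked numerically. The sketch below uses one hand-picked $2 \times 2$ unitary with complex entries (an arbitrary example, not taken from the text) and contrasts $U \otimes U^*$ with $U \otimes U$ to show that the complex conjugate matters.

```python
from math import sqrt

def apply_local(U, V, psi, d):
    """Apply U (x) V to a bipartite amplitude vector psi indexed by a*d + b."""
    out = [0j] * (d * d)
    for a in range(d):
        for b in range(d):
            out[a * d + b] = sum(U[a][x] * V[b][y] * psi[x * d + y]
                                 for x in range(d) for y in range(d))
    return out

def close(u, v, tol=1e-12):
    """Entrywise comparison of two amplitude vectors."""
    return all(abs(a - b) < tol for a, b in zip(u, v))

d = 2
phi = [(1 / sqrt(d) if x == y else 0.0) for x in range(d) for y in range(d)]

s = 1 / sqrt(2)
U = [[s, 1j * s], [1j * s, s]]                      # example unitary, complex entries
U_conj = [[u.conjugate() for u in row] for row in U]

rotated = apply_local(U, U_conj, phi, d)   # (U (x) U*) |Phi_d>
naive = apply_local(U, U, phi, d)          # (U (x) U) |Phi_d>, for contrast

covariant = close(rotated, phi)
naive_invariant = close(naive, phi)
```

For this choice of $U$, `covariant` is true and `naive_invariant` is false: the state is invariant only when Bob applies the conjugate rotation.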

1.2.2 Finite resources 

In this subsection we introduce "finite" or "non-asymptotic" resources. They can be either static or
dynamic, but strictly speaking, thanks to the appending maps $\mathcal{A}_\rho$, we only need to consider dynamic
ones.

Definition 1.5 (Finite resources). A finite static resource is a quantum state $\rho^{AB}$. A finite
dynamic resource is an ordered pair $(\mathcal{N} : \omega)$, where $\mathcal{N} : A'B' \to AB$ is an operation, with Alice's
and Bob's input systems decomposed as $A' = A^{\mathrm{abs}} A^{\mathrm{rel}}$, $B' = B^{\mathrm{abs}} B^{\mathrm{rel}}$, and $\omega^{A^{\mathrm{rel}} B^{\mathrm{rel}}}$ is a so-called
test state.

The idea of the resource character of states and channels (static and dynamic, resp.) ought to be
clear. The only thing we need to explain is why we assume that $\mathcal{N}$ comes with a test state (contained
in the "relative" systems $A^{\mathrm{rel}} B^{\mathrm{rel}}$): for finite resources it serves only a syntactic purpose. The
operation "expects" an extension of $\omega$ as input, which will play a role in the definition of (valid)
protocols below. The test state may not comprise the entire input to $\mathcal{N}$, in which case the remainder
of the input comes from the systems $A^{\mathrm{abs}} B^{\mathrm{abs}}$.

If $A^{\mathrm{rel}} B^{\mathrm{rel}} = \emptyset$, we identify $(\mathcal{N} : \omega)$ with the proper dynamic resource $\mathcal{N}$. Note that $\mathcal{A}_\rho$ is always
a proper dynamic resource, as it has no inputs.

A resource $(\mathcal{N} : \omega)$ is called pure if $\mathcal{N}$ is an isometry. It is called classical if $\mathcal{N}$ is a $\{c \to c\}$ entity
and $\omega$ is a $\{c\}$ entity (though they may be expressed in the QQ formalism).

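We define a distance measure between two dynamic resources $(\mathcal{N} : \omega)$ and $(\mathcal{N}' : \omega)$ with the same
test state as

$$\|(\mathcal{N}' : \omega) - (\mathcal{N} : \omega)\| := \|\mathcal{N}' - \mathcal{N}\|_{\omega}.$$

A central notion is that of comparison between resources: we take the operational view that one
finite resource, $(\mathcal{N}_1 : \omega_1)$, is stronger than another, $(\mathcal{N}_2 : \omega_2)$, in symbols $(\mathcal{N}_1 : \omega_1) \ge (\mathcal{N}_2 : \omega_2)$,
if the former can be used to perfectly simulate the latter. We demand first that there exist local
operations $\mathcal{E}_A : A'_2 \to A'_1$ and $\mathcal{D}_A : A_1 \to A_2$ for Alice, and $\mathcal{E}_B : B'_2 \to B'_1$ and $\mathcal{D}_B : B_1 \to B_2$ for
Bob, such that

$$\mathcal{N}_2 = (\mathcal{D}_A \otimes \mathcal{D}_B)\, \mathcal{N}_1\, (\mathcal{E}_A \otimes \mathcal{E}_B); \tag{1.5}$$

and second that the simulation be valid, meaning that for every $\Omega_1 \supseteq \omega_1$,

$$\Omega_2 := (\mathcal{E}_A \otimes \mathcal{E}_B)\, \Omega_1 \supseteq \omega_2. \tag{1.6}$$

When this occurs, we also say that $(\mathcal{N}_2 : \omega_2)$ reduces to $(\mathcal{N}_1 : \omega_1)$.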
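Two important properties of this relation are that

1. It is transitive; i.e. if $(\mathcal{N}_1 : \omega_1) \ge (\mathcal{N}_2 : \omega_2)$ and $(\mathcal{N}_2 : \omega_2) \ge (\mathcal{N}_3 : \omega_3)$, then $(\mathcal{N}_1 : \omega_1) \ge (\mathcal{N}_3 : \omega_3)$.

2. It is continuous; i.e. if $(\mathcal{N}_1 : \omega_1) \ge (\mathcal{N}_2 : \omega_2)$, then for any channel $\mathcal{N}'_1$ there exists $\mathcal{N}'_2$ such that
$(\mathcal{N}'_1 : \omega_1) \ge (\mathcal{N}'_2 : \omega_2)$ and $\|\mathcal{N}'_2 - \mathcal{N}_2\|_{\omega_2} \le \|\mathcal{N}'_1 - \mathcal{N}_1\|_{\omega_1}$.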

The tensor product of states naturally extends to dynamic resources:

$$(\mathcal{N}_1 : \omega_1) \otimes (\mathcal{N}_2 : \omega_2) := (\mathcal{N}_1 \otimes \mathcal{N}_2 : \omega_1 \otimes \omega_2).$$

However, contrary to what one might expect, $(\mathcal{N}_1 \otimes \mathcal{N}_2 : \omega_1 \otimes \omega_2) \ge (\mathcal{N}_1 : \omega_1)$ holds if and only if
$\omega_1^{A'_1 B'_1}$ can be perfectly mapped with local operations to a state $\Omega^{A'_1 B'_1 A'_2 B'_2}$ such that $\Omega^{A'_1 B'_1} = \omega_1$
and $\Omega^{A'_2 B'_2} = \omega_2$. Thus, we will almost always consider resources where the test state $\omega$ is a product
state. Nevertheless, nontrivial examples exist when the tensor product is stronger than its component
resources; for example, $(\mathcal{N} : \omega)^{\otimes 2}$ when $\omega$ is a classically correlated state.

A more severe limitation on these resource comparisons is that they do not allow for small errors
or inefficiencies. Thus, most resources are incomparable and most interesting coding theorems do
not yield useful exact resource inequalities. We will address these issues in the next section when we
define asymptotic resources and asymptotic resource inequalities.

Resources as above are atomic primitives: "having" such a resource means (given an input state)
the ability to invoke the operation (once). When formalizing the notion of "having" several resources,
e.g., the choice from different channels, it would be too restrictive to model this by the tensor product,
because it gives us just another resource, which the parties have to use in a sort of "block code". To
allow for finite recursive depth (think, e.g., of feedback, where future channel uses depend on
the past ones) in using the resources, we introduce the following:

Definition 1.6 (Depth-$\ell$ resources). A finite depth-$\ell$ resource is an unordered collection of,
w.l.o.g., dynamic resources

$$(\mathcal{N} : \omega)^{\ell} := ((\mathcal{N}_1 : \omega_1), \ldots, (\mathcal{N}_\ell : \omega_\ell)).$$

Both static and dynamic resources are identified with depth-1 resources. To avoid notational
confusion, for $\ell$ copies of the same dynamic resource, $((\mathcal{N} : \omega), \ldots, (\mathcal{N} : \omega))$, we reserve the notation
$(\mathcal{N} : \omega)^{\times \ell}$.

The definition of the distance measure naturally extends to the case of two depth-$\ell$ resources:

$$\|(\mathcal{N}' : \omega)^{\ell} - (\mathcal{N} : \omega)^{\ell}\| := \min_{\pi \in S_\ell} \sum_{j \in [\ell]} \|(\mathcal{N}'_j : \omega_j) - (\mathcal{N}_{\pi(j)} : \omega_{\pi(j)})\|.$$

Here $S_\ell$ is the set of permutations on $\ell$ objects; we need to minimize over it to reflect the fact that
we're free to use depth-$\ell$ resources in an arbitrary order.
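The minimization over orderings in this distance can be made concrete with a brute-force sketch. It assumes the pairwise distances between individual resources have already been computed and are supplied as a matrix; the function name `depth_l_distance` and the sample numbers are hypothetical.

```python
from itertools import permutations

def depth_l_distance(pairwise):
    """Distance between two depth-l collections.

    pairwise[j][i] is the (precomputed) distance between the j-th resource of
    the first collection and the i-th resource of the second.  The result is
    the minimum over all permutations pi of sum_j pairwise[j][pi[j]].
    """
    l = len(pairwise)
    return min(sum(pairwise[j][pi[j]] for j in range(l))
               for pi in permutations(range(l)))

# Two depth-2 collections that match closely only after swapping the order.
example = [[0.9, 0.1],
           [0.2, 0.8]]
best = depth_l_distance(example)   # the swap (0.1 + 0.2) beats the identity (0.9 + 0.8)
```

Enumerating all $\ell!$ permutations is only sensible for small $\ell$; it is used here because it mirrors the definition directly.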

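To combine resources there is no good definition of a tensor product (which operations should we
take the products of?), but we can take tensor powers of a resource:

$$((\mathcal{N} : \omega)^{\ell})^{\otimes k} := ((\mathcal{N}_1 : \omega_1)^{\otimes k}, \ldots, (\mathcal{N}_\ell : \omega_\ell)^{\otimes k}).$$

The way we combine a depth-$\ell$ and a depth-$\ell'$ resource is by concatenation: let

$$(\mathcal{N} : \omega)^{\ell} + (\mathcal{N}' : \omega')^{\ell'} := ((\mathcal{N}_1 : \omega_1), \ldots, (\mathcal{N}_\ell : \omega_\ell), (\mathcal{N}'_1 : \omega'_1), \ldots, (\mathcal{N}'_{\ell'} : \omega'_{\ell'})).$$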

We now have to extend the concept of one resource simulating another to depth-$\ell$; at the same
time we will introduce the notions of approximation that will become essential for the asymptotic
resources below.

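Definition 1.7 (Elementary protocols). An elementary protocol $\mathbf{P}$ takes a depth-$\ell$ finite resource
$(\mathcal{N} : \omega)^{\ell}$ to a depth-1 finite resource. Given $\mathcal{N}_i : A'_i B'_i \to A_i B_i$ and test states $\omega_i^{A'_i B'_i}$, $i = 1 \ldots \ell$,
$\mathbf{P}[(\mathcal{N} : \omega)^{\ell}]$ is a finite depth-1 resource $(\mathcal{P} : \Omega^{A'B'})$, with a quantum operation $\mathcal{P} : A'B' \to AB$,
which is constructed as follows:*

1. select a permutation $\pi$ of the integers $\{1, \ldots, \ell\}$;

2. perform local operations $\mathcal{E}_0 : A' \to A_0 A_0^{\mathrm{aux}}$ and $\mathcal{E}'_0 : B' \to B_0 B_0^{\mathrm{aux}}$;

3. repeat, for $i = 1, \ldots, \ell$,

(a)$_i$ perform local isometries $\mathcal{E}_i : A_{i-1} A_{i-1}^{\mathrm{aux}} \to A'_i A_i^{\mathrm{aux}}$ and $\mathcal{E}'_i : B_{i-1} B_{i-1}^{\mathrm{aux}} \to B'_i B_i^{\mathrm{aux}}$;

(b)$_i$ apply the operation $\mathcal{N}_{\pi(i)}$, mapping $A'_i B'_i$ to $A_i B_i$;

4. perform local operations $\mathcal{E}_{\ell+1} : A_\ell A_\ell^{\mathrm{aux}} \to A$ and $\mathcal{E}'_{\ell+1} : B_\ell B_\ell^{\mathrm{aux}} \to B$.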

*We use diverse notation to emphasize the role of the systems in question. The primed systems, such as $A'_i$, are
channel inputs. The systems with no superscript, such as $B_i$, are channel outputs. Some systems are associated with
Alice's sources (e.g. $A_i^{\mathrm{rel}}$) and Bob's possible side information about those sources (e.g. $B_i^{\mathrm{rel}}$). Furthermore, there are
auxiliary systems, such as $A_i^{\mathrm{aux}}$.

We allow the arbitrary permutation of the resources $\pi$ so that depth-$\ell$ resources do not have to be used
in a fixed order. Denote by $\mathcal{P}_i$ the operation of performing the protocol up to, but not including, step
3(b)$_i$. Define $\hat{\mathcal{P}}_i$ to be $\mathcal{P}_i$ followed by a restriction onto $A_i^{\mathrm{rel}} B_i^{\mathrm{rel}}$. The protocol $\mathbf{P}$ is called $\eta$-valid on
the input finite resource $(\mathcal{N} : \omega)^{\ell}$ if the conditions

$$\|\hat{\mathcal{P}}_i(\Omega) - \omega_i\|_1 \le \eta$$

are met for all $i$.

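Definition 1.8 (Standard protocol). Define the standard protocol $\mathbf{S}$, which is a 0-valid elementary
protocol on a depth-$\ell$ finite resource $(\mathcal{N} : \omega)^{\ell}$, by

$$\mathbf{S}[(\mathcal{N} : \omega)^{\ell}] = \Big(\bigotimes_{i} \mathcal{N}_i : \bigotimes_{i} \omega_i\Big).$$

That is, this protocol takes a list of resources, and flattens them into a depth-1 tensor product.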

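Whenever $(\mathcal{N} : \omega) \ge (\mathcal{N}' : \omega')$, there is a natural protocol $\mathbf{R}$, which is 0-valid on $(\mathcal{N} : \omega)$,
implementing the reduction:

$$\mathbf{R}[(\mathcal{N} : \omega)] = (\mathcal{N}' : \omega'),$$

which we write as

$$\mathbf{R} : (\mathcal{N} : \omega) \ge (\mathcal{N}' : \omega').$$

For resources with depth $> 1$, $(\mathcal{N} : \omega)^{\ell} = ((\mathcal{N}_1 : \omega_1), \ldots, (\mathcal{N}_\ell : \omega_\ell))$ and $(\mathcal{N}' : \omega')^{\ell'} = ((\mathcal{N}'_1 :
\omega'_1), \ldots, (\mathcal{N}'_{\ell'} : \omega'_{\ell'}))$, we say that $(\mathcal{N} : \omega)^{\ell} \ge (\mathcal{N}' : \omega')^{\ell'}$ if there exists an injective function $f : [\ell'] \to [\ell]$
such that for all $i \in [\ell']$, $(\mathcal{N}_{f(i)} : \omega_{f(i)}) \ge (\mathcal{N}'_i : \omega'_i)$. In other words, for each $(\mathcal{N}'_i : \omega'_i)$ there is a
unique $(\mathcal{N}_j : \omega_j)$ that reduces to $(\mathcal{N}'_i : \omega'_i)$. Note that this implies $\ell \ge \ell'$. Again there is a natural
0-valid protocol $\mathbf{R}$ implementing the reduction.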
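The injective-function condition for depth greater than 1 is a small matching problem. The sketch below is a backtracking search over a boolean table; it assumes the pairwise comparisons between individual resources have been decided elsewhere, and the name `find_injective_reduction` and the table layout are illustrative, not from the text.

```python
def find_injective_reduction(can_simulate):
    """Search for an injective f with can_simulate[i][f[i]] true for every target i.

    can_simulate[i][j] is True when the j-th input resource is at least as
    strong as the i-th target resource.  Returns the list f, or None if no
    injective assignment exists.
    """
    n_targets = len(can_simulate)
    n_inputs = len(can_simulate[0]) if can_simulate else 0
    assignment = [None] * n_targets
    used = set()

    def extend(i):
        if i == n_targets:
            return True
        for j in range(n_inputs):
            if can_simulate[i][j] and j not in used:
                used.add(j)
                assignment[i] = j
                if extend(i + 1):
                    return True
                used.discard(j)       # undo and try the next candidate
        assignment[i] = None
        return False

    return assignment if extend(0) else None

# Target 0 could use input 0 or 1; target 1 can only use input 0.
f = find_injective_reduction([[True, True, False],
                              [True, False, False]])
# Both targets compete for the single usable input, so no injective f exists.
g = find_injective_reduction([[True, False],
                              [True, False]])
```

Since $f$ must be injective, a table with more targets than inputs always returns `None`, matching the remark that the condition implies $\ell \ge \ell'$.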

The next two lemmas help justify aspects of our definition of a protocol ($\eta$-validity and the fact
that outputs are depth-1) that will later be crucial in showing how protocols may be composed.

First we show why $\eta$-validity is important. In general we want our distance measures for states to
satisfy the triangle inequality, and to be nonincreasing under quantum operations. These properties
guarantee that the error of a sequence of quantum operations is no more than the sum of errors of
each individual operation (cf. part 4 of Lemma 1.4 as well as [BV93]). However, this assumes that we
are using the same distance measure throughout the protocol; when working with relative resources,
a small error with respect to one input state may be much larger for a different input state. Thus,
for a protocol to map approximately correct inputs to approximately correct outputs, we need the
additional assumption that the protocol is $\eta$-valid.

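Lemma 1.9 (Continuity). If some elementary protocol $\mathbf{P}$ is $\eta$-valid on $[(\mathcal{N} : \omega)^{\ell}]$ and

$$\|(\mathcal{N} : \omega)^{\ell} - (\mathcal{N}' : \omega)^{\ell}\| \le \epsilon,$$

then

$$\|\mathbf{P}[(\mathcal{N} : \omega)^{\ell}] - \mathbf{P}[(\mathcal{N}' : \omega)^{\ell}]\| \le \ell(\epsilon + 2\sqrt{\eta})$$

and $\mathbf{P}[(\mathcal{N}' : \omega)^{\ell}]$ is $(\eta + \ell(\epsilon + 2\sqrt{\eta}))$-valid.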

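Proof. Let $(\mathcal{P} : \Omega) = \mathbf{P}[(\mathcal{N} : \omega)^{\ell}]$ and $(\mathcal{P}' : \Omega) = \mathbf{P}[(\mathcal{N}' : \omega)^{\ell}]$. By Definition 1.7, $\mathcal{P}$ is of the form

$$\mathcal{P} = \mathcal{E}_{\ell+1} \circ \mathcal{N}_\ell \circ \mathcal{E}_\ell \circ \cdots \circ \mathcal{N}_1 \circ \mathcal{E}_1$$

and similarly for $\mathcal{P}'$. The $\eta$-validity condition reads, for all $i$,

$$\|\hat{\mathcal{P}}_i(\Omega) - \omega_i\|_1 \le \eta.$$

By part 3 of Lemma 1.4,

$$\|\mathcal{P} - \mathcal{P}'\|_{\Omega} \le \sum_i \|\mathcal{N}'_i - \mathcal{N}_i\|_{\mathcal{P}_i(\Omega)}.$$

By part 1 of Lemma 1.4,

$$\|\mathcal{N}'_i - \mathcal{N}_i\|_{\mathcal{P}_i(\Omega)} = \|\mathcal{N}'_i - \mathcal{N}_i\|_{\hat{\mathcal{P}}_i(\Omega)}.$$

By part 2 of Lemma 1.4 and $\eta$-validity,

$$\|\mathcal{N}'_i - \mathcal{N}_i\|_{\hat{\mathcal{P}}_i(\Omega)} \le \|\mathcal{N}'_i - \mathcal{N}_i\|_{\omega_i} + 2\sqrt{\eta}.$$

Hence

$$\|\mathcal{P} - \mathcal{P}'\|_{\Omega} \le \ell(\epsilon + 2\sqrt{\eta}),$$

which is one of the statements of the lemma. To estimate the validity of $\mathbf{P}$ on $[(\mathcal{N}' : \omega)^{\ell}]$, note that
one obtains in the same way as above, for all $i$,

$$\|\mathcal{P}_i - \mathcal{P}'_i\|_{\Omega} \le \ell(\epsilon + 2\sqrt{\eta}).$$

Combining this with the $\eta$-validity condition via the triangle inequality finally gives

$$\|\hat{\mathcal{P}}'_i(\Omega) - \omega_i\|_1 \le \eta + \ell(\epsilon + 2\sqrt{\eta}),$$

concluding the proof. □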



We note that we do not have a concept of what it means to turn a depth-$\ell$ resource into a depth-$\ell'$
resource; instead, our basic concept of simulation produces a depth-1 resource. I.e., we can formulate
what it means that a depth-$\ell$ resource simulates the standard protocol of a depth-$\ell'$ resource.

The following lemma states that the standard protocol is basically sufficient to generate any other,
under some i.i.d.-like assumptions.

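Lemma 1.10 (Sliding). If for some depth-$\ell$ finite resource $(\mathcal{N} : \omega)^{\ell} = ((\mathcal{N}_1 : \omega_1), \ldots, (\mathcal{N}_\ell : \omega_\ell))$
and quantum operation $\mathcal{C}$,

$$\Big\|\Big(\mathcal{C} : \bigotimes_i \omega_i\Big) - \mathbf{S}[(\mathcal{N} : \omega)^{\ell}]\Big\| \le \epsilon, \tag{1.7}$$

then for any integer $m \ge 1$ and for any $\eta$-valid protocol $\mathbf{P}$ on $(\mathcal{N} : \omega)^{\ell}$, there exists a
$((m + \ell - 1)(\epsilon + 2\sqrt{\eta}) + \eta)$-valid protocol $\mathbf{P}'$ on $(\mathcal{C} : \bigotimes_i \omega_i)^{\times (m+\ell-1)}$, such that

$$\Big\|\mathbf{P}'\Big[\Big(\mathcal{C} : \bigotimes_i \omega_i\Big)^{\times (m+\ell-1)}\Big] - (\mathbf{P}[(\mathcal{N} : \omega)^{\ell}])^{\otimes m}\Big\| \le (m + \ell - 1)(\epsilon + 2\sqrt{\eta}).$$

Proof. Denoting by $\mathbf{P}'$ the sliding protocol (see Fig. 1-1) it is clear that

$$\mathbf{P}'[(\mathbf{S}[(\mathcal{N} : \omega)^{\ell}])^{\times (m+\ell-1)}] = (\mathbf{P}[(\mathcal{N} : \omega)^{\ell}])^{\otimes m}.$$

The result follows from Lemma 1.9. □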



The sliding protocol shows how working with depth-1 resources is not overly restrictive. Another
difficulty with resources is that relative resources are only guaranteed to work properly when given
the right sort of input state. Here we show that using shared randomness, some of the standard
relative resources can be "absolutized," removing the restriction to a particular input state.

Figure 1-1: The sliding protocol. We would like to simulate $\mathbf{P}$, which uses $\mathcal{N}_1, \ldots, \mathcal{N}_\ell$ consecutively,
but we are only given $\mathcal{N}_1 \otimes \cdots \otimes \mathcal{N}_\ell$. The horizontal blocks represent uses of $\mathcal{N}_1 \otimes \cdots \otimes \mathcal{N}_\ell$ and stacking
them vertically indicates how we perform them consecutively with the output of one block becoming
the input of the block above it (i.e. time flows from the bottom to the top). Thus $m + \ell - 1$ consecutive
uses of $\mathcal{N}_1 \otimes \cdots \otimes \mathcal{N}_\ell$ can simulate $m$ copies of $\mathbf{P}$.
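The counting behind the sliding protocol can be sketched as a scheduling table. This is one natural staggering consistent with the caption, written under the assumption that copy $c$ of $\mathbf{P}$ uses channel $\mathcal{N}_i$ in block $c + i - 1$; the function name `sliding_schedule` is hypothetical, and how idle channel slots are fed (e.g. with the test state) is not modelled.

```python
def sliding_schedule(m, l):
    """Staggered schedule for simulating m copies of an l-step protocol.

    Returns a list of m + l - 1 blocks.  Block t is a dict {i: c} meaning that,
    in the t-th use of N_1 (x) ... (x) N_l, the factor N_i serves copy c of P.
    Factors missing from a block are idle in that block.
    """
    schedule = []
    for t in range(m + l - 1):
        block = {}
        for i in range(1, l + 1):
            c = t - (i - 1)          # copy c reaches step i in block c + i - 1
            if 0 <= c < m:
                block[i] = c
        schedule.append(block)
    return schedule

m, l = 3, 2
schedule = sliding_schedule(m, l)
# schedule == [{1: 0}, {1: 1, 2: 0}, {1: 2, 2: 1}, {2: 2}]
```

Each copy meets $\mathcal{N}_1, \ldots, \mathcal{N}_\ell$ in order, one per block, and within a block every factor serves at most one copy, which is why $m + \ell - 1$ blocks suffice.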



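Lemma 1.11. For an operation $\mathcal{N} : A' \to AB$ which is either the perfect quantum channel $\mathrm{id}_d$, the
coherent channel $\Delta_d$ or the perfect classical channel $\overline{\mathrm{id}}_d$, there exists a 0-valid protocol $\mathbf{P}$ such that

$$\mathbf{P}[\overline{\Phi}^{X_A X_B}, (\mathcal{N} : \tau^{A'})] = \mathcal{N} \otimes \mathcal{A}_{\overline{\Phi}^{X_A X_B}},$$

where $\dim X_A = (\dim A')^2$, and $\tau^{A'}$ is the maximally mixed state on $A'$.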



Proof. Consider first the case of $\mathcal{N}$ being either $\mathrm{id}_d$ or the coherent channel $\Delta_d$. The main observation
is that there exists a set of unitary operations $\{U_x\}_{x \in [d^2]}$ (the generalized Pauli, or discrete Weyl,
operators) such that, for any state $\rho$ living on a $d$-dimensional Hilbert space,

$$d^{-2} \sum_x U_x \rho U_x^\dagger = \tau_d, \tag{1.8}$$

with $\tau_d$ being the maximally mixed state on that space.
Let Alice and Bob share the common randomness state

$$\overline{\Phi}^{X_A X_B} = d^{-2} \sum_{x=1}^{d^2} |x\rangle\langle x|^{X_A} \otimes |x\rangle\langle x|^{X_B},$$

where $d := \dim A'$. Consider an arbitrary input state $|\phi\rangle^{RA'}$, possibly entangled between Alice and
a reference system $R$. Alice performs the conditional unitary $\sum_x |x\rangle\langle x|^{X_A} \otimes U_x^{A'}$, yielding a state
whose restriction to $A'$ is maximally mixed. She then applies the operation $\mathcal{N}$ (this is 0-valid!), which
gives the state

$$d^{-2} \sum_{x=1}^{d^2} |x\rangle\langle x|^{X_A} \otimes |x\rangle\langle x|^{X_B} \otimes (\mathcal{N} \circ U_x^{A'})\, \phi^{RA'}.$$

In the case of the $\mathrm{id}_d$ channel, Bob simply applies the conditional unitary $\sum_x |x\rangle\langle x|^{X_B} \otimes (U_x^{-1})^B$. In
the case of the $\Delta_d$ channel Alice must also perform the analogous conditional correction, controlled on
$X_A$, on her output system $A$. Either way, the final state is

$$\overline{\Phi}^{X_A X_B} \otimes \mathcal{N}(\phi^{RA'}),$$

as advertised.

The case of the perfect classical channel $\overline{\mathrm{id}}_d$ is a classical analogue of the above. The observation
here is that there exists a set of $d$ unitaries $\{U_x\}_{x \in [d]}$ (all the cyclic permutations of the basis vectors),
each member of which commutes with $\overline{\Delta}$, such that (1.8) holds for any state $\rho$ diagonal in the preferred
basis. Now Alice first applies a local $\overline{\Delta}$ (diagonalizing the input), before proceeding as above. This
concludes the proof. □
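The twirling identity (1.8) used in this proof can be verified numerically for small $d$. The sketch below takes the discrete Weyl operators to be $X^a Z^b$ with $X|y\rangle = |y+1 \bmod d\rangle$ and $Z|y\rangle = e^{2\pi i y/d}|y\rangle$, which is one common convention and an assumption here; the input state is an arbitrary example.

```python
import cmath

def weyl_operators(d):
    """The d^2 matrices X^a Z^b, where X^a Z^b |c> = w^(b*c) |c + a mod d>."""
    w = cmath.exp(2j * cmath.pi / d)
    return [[[w ** (b * c) if r == (c + a) % d else 0j for c in range(d)]
             for r in range(d)]
            for a in range(d) for b in range(d)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(A):
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]

def twirl(rho):
    """Average U rho U^dagger over all d^2 Weyl operators."""
    d = len(rho)
    acc = [[0j] * d for _ in range(d)]
    for U in weyl_operators(d):
        term = matmul(matmul(U, rho), dagger(U))
        for i in range(d):
            for j in range(d):
                acc[i][j] += term[i][j] / d ** 2
    return acc

# An arbitrary normalized pure state on a 3-dimensional space.
psi = [0.6, 0.0, 0.8j]
rho = [[a * b.conjugate() for b in psi] for a in psi]
tau = twirl(rho)   # numerically equal to identity / 3
```

Averaging over the $Z^b$ alone already removes the off-diagonal entries, and averaging over the $X^a$ then equalizes the diagonal; for states that are already diagonal the $d$ cyclic shifts suffice, which is the classical case treated at the end of the proof.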

Observe that in the above lemma, the final output of $\mathcal{N}$ is uncorrelated with the shared randomness
that is used. In the QQ formalism, this is immediately apparent from the tensor product between $\mathcal{N}$
and $\mathcal{A}_{\overline{\Phi}}$. Thus we say that the shared randomness is (incoherently) decoupled from the rest of
the protocol.

Now consider the case when $\mathcal{N} = \overline{\mathrm{id}}_d$ and use the QP formalism, so $\mathcal{N}$ is a map from $A'$ to $BE$. If
we condition on a particular message sent by Alice, then the randomness is no longer decoupled from
the composite $BE$ system. This is the problem of reusing the key in a one-time pad: if the message
is not uniformly random, then information about the key leaks to Eve.
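The one-time-pad remark can be quantified in the simplest case of a single bit. The sketch below assumes a uniformly random key bit $K$, an independent message bit $M$ with $\Pr[M = 1] = p$, and an eavesdropper who sees $C = M \oplus K$; the function name `key_leak_bits` is hypothetical.

```python
from math import log2

def key_leak_bits(p_one):
    """Mutual information I(K;C) in bits for C = M xor K.

    K is a uniform key bit; M is an independent message bit with
    Pr[M = 1] = p_one.  The result equals 1 - h(p_one), with h the binary entropy.
    """
    p_m = {0: 1.0 - p_one, 1: p_one}
    joint = {}
    for k in (0, 1):
        for m in (0, 1):
            c = m ^ k
            joint[(k, c)] = joint.get((k, c), 0.0) + 0.5 * p_m[m]
    p_c = {c: joint[(0, c)] + joint[(1, c)] for c in (0, 1)}
    return sum(p * log2(p / (0.5 * p_c[c]))
               for (k, c), p in joint.items() if p > 0)

uniform_message = key_leak_bits(0.5)   # 0 bits: ciphertext reveals nothing about K
fixed_message = key_leak_bits(1.0)     # 1 bit: conditioning on the message exposes K
```

A uniformly random message leaves the key independent of what Eve sees, whereas conditioning on a particular message makes the ciphertext a deterministic function of the key.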

On the other hand, if $\mathcal{N}$ is $\Delta_d$ or $\mathrm{id}_d$ then the shared randomness is decoupled even from the
environment. This stronger form of decoupling is called coherent decoupling. Below we give formal
definitions of these notions of decoupling.*

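Definition 1.12 (Incoherent decoupling). Consider a protocol $\mathbf{P}$ on $((\mathcal{N} : \omega)^{\ell}, (\overline{\mathcal{N}} : \overline{\omega})^{\ell'})$, where
$(\overline{\mathcal{N}} : \overline{\omega})^{\ell'}$ is classical. Recall that in the QQ formalism classical systems are unchanged under the
copying operation $\overline{\Delta}$. This means we can consider an equivalent protocol in which the systems
associated with the classical resource $(\overline{\mathcal{N}} : \overline{\omega})^{\ell'}$ are copied into a composite classical system $Z$, which
includes all the copies of all the random variables involved. Let $\mathbf{P}'$ be the modified version of $\mathbf{P}$ which
retains $Z$ in the final state. Now $\mathcal{P}' := \mathbf{P}'[((\mathcal{N} : \omega)^{\ell}, (\overline{\mathcal{N}} : \overline{\omega})^{\ell'})]$ takes a particular extension
$\Upsilon^{RA'B'} \supseteq \Omega^{A'B'}$ to some state $\sigma^{ZRAB}$.

We say that the classical resource $(\overline{\mathcal{N}} : \overline{\omega})^{\ell'}$ is $\epsilon$-incoherently decoupled (or just $\epsilon$-decoupled)
with respect to the protocol $\mathbf{P}$ on $((\mathcal{N} : \omega)^{\ell}, (\overline{\mathcal{N}} : \overline{\omega})^{\ell'})$ if for any $\Upsilon^{RA'B'}$ the state $\sigma^{ZRAB}$
satisfies

$$\|\sigma^{ZRAB} - \sigma^{Z} \otimes \sigma^{RAB}\|_1 \le \epsilon. \tag{1.9}$$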

We describe separately how classical resources used in the input and the output of a protocol may 
be coherently decoupled. 

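Definition 1.13 (Coherent decoupling of input resources). Again, consider a protocol $\mathbf{P}$ on
$((\mathcal{N} : \omega)^{\ell}, (\overline{\mathcal{N}} : \overline{\omega})^{\ell'})$, where $(\overline{\mathcal{N}} : \overline{\omega})^{\ell'}$ is classical. Now we adopt a QP view in which all non-classical
states are purified and all channels are isometrically extended. Again, we define a classical system
$Z$ which contains copies of all the classical variables associated with the resource $(\overline{\mathcal{N}} : \overline{\omega})^{\ell'}$. The
final state of the protocol is then some $\sigma^{ZRABE}$. We say that the classical resource $(\overline{\mathcal{N}} : \overline{\omega})^{\ell'}$ is
$\epsilon$-coherently decoupled with respect to the protocol $\mathbf{P}$ on $((\mathcal{N} : \omega)^{\ell}, (\overline{\mathcal{N}} : \overline{\omega})^{\ell'})$ if for any $\Upsilon^{RA'B'}$
the final state $\sigma^{ZRABE}$ satisfies

$$\|\sigma^{ZRABE} - \sigma^{Z} \otimes \sigma^{RABE}\|_1 \le \epsilon.$$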



*The notion of an "oblivious" protocol for remotely preparing quantum states is similar to coherent decoupling, but
applies instead to quantum messages [LS03].

Definition 1.14 (Coherent decoupling of output resources). Remaining within the QP
formalism, let $\mathbf{P}$ be a protocol mapping $(\mathcal{N} : \omega)^{\ell}$ to $(\overline{\mathcal{P}}_1 \otimes \mathcal{P}_2 : \overline{\Omega}_1^{A_1 B_1} \otimes \Omega_2^{A_2 B_2})$; i.e. the tensor product of a
classical resource $(\overline{\mathcal{P}}_1 : \overline{\Omega}_1^{A_1 B_1})$ and a quantum resource $(\mathcal{P}_2 : \Omega_2^{A_2 B_2})$. Define $Z$ to consist of copies of
$A_1 B_1$ together with all the other classical resources associated with $\overline{\mathcal{P}}_1$, such as outputs (if different
from $A_1$) and inputs other than $A_1$ (if any).

We now say that the classical resource $(\overline{\mathcal{P}}_1 : \overline{\Omega}_1)$ is $\epsilon$-coherently decoupled with respect to the
protocol $\mathbf{P}$ on $(\mathcal{N} : \omega)^{\ell}$ if

$$\|\sigma^{ZQ} - \sigma^{Z} \otimes \sigma^{Q}\|_1 \le \epsilon,$$

where now $Q$ comprises all the quantum systems involved (including environments and reference
systems).

We will give some applications of decoupling in Section 1.3, but its primary utility will be seen in 
Chapters 3 and 4. 

One simple example of decoupling is when a protocol involves several pure resources (i.e.
isometries) and one noiseless classical resource. In this case, decoupling the classical resource is rather
easy, since pure resources don't involve the environment. However, it is possible that the classical
communication is correlated with the ancilla system $Q$ that Alice and Bob are left with. If $Q$ is
merely discarded, then the cbits will be incoherently decoupled. To prove that coherent decoupling
is in fact possible, we will need to carefully account for the ancillas produced by the classical
communication. This will be accomplished in Section 3.5, where we prove that classical messages sent
through isometric channels can always be coherently decoupled.

1.2.3 Asymptotic resources 

Definition 1.15 (Asymptotic resources). An asymptotic resource $\alpha$ is defined by a sequence
of finite depth-$\ell$ resources $(\alpha_n)_{n=1}^{\infty}$, where $\alpha_n$ is w.l.o.g. of the form $\alpha_n = (\mathcal{N}_n : \omega_n)^{\ell} := ((\mathcal{N}_{n,1} :
\omega_{n,1}), (\mathcal{N}_{n,2} : \omega_{n,2}), \ldots, (\mathcal{N}_{n,\ell} : \omega_{n,\ell}))$, such that

• $\alpha_n \ge \alpha_{n-1}$ for all $n$; $\quad$ (1.10)

• for any $\delta > 0$, any integer $k$ and all sufficiently large $n$,

$$\alpha_{\lfloor n(1+\delta) \rfloor} \ge (\alpha_{\lfloor n/k \rfloor})^{\otimes k} \ge \alpha_{\lfloor n(1-\delta) \rfloor}. \tag{1.11}$$

We sometimes refer to this as the requirement that a resource be "quasi-i.i.d."
Denote the set of asymptotic resources by $\mathcal{R}$.

Given two resources $\alpha = (\alpha_n)_{n=1}^{\infty}$ and $\beta = (\beta_n)_{n=1}^{\infty}$, if $\alpha_n \ge \beta_n$ for all sufficiently large $n$, then we
write $\alpha \ge \beta$. We shall use the following convention: if $\beta = (\mathcal{N}_n)_n$, where all $\mathcal{N}_n$ are proper dynamic
resources and $\gamma = (\omega_n)_n$, where all $\omega_n$ are proper static resources, then $(\beta : \gamma) := (\mathcal{N}_n : \omega_n)_n$. Note
that typically $\omega_n$ is a product state, so the resource $\gamma$ reduces to the null resource $\emptyset$; however this is no
problem as long as we are interested in $\gamma$ only as a test state for $\beta$.

Our next goal is to define what it means to simulate one (asymptotic) resource by another. 

Definition 1.16 (Asymptotic resource inequalities). A resource inequality $\alpha \ge \beta$ holds between
two resources $\alpha = (\alpha_n)_n$ and $\beta = (\beta_n)_n$ if for any $\delta > 0$ there exists an integer $k$ such that for any
$\epsilon > 0$ there exists $N$ such that for all $n \ge N$ there exists an $\epsilon$-valid protocol $\mathbf{P}^{(n)}$ on $(\alpha_{\lfloor n/k \rfloor})^{\times k}$ (i.e.
$k$ sequential uses of $\alpha_{\lfloor n/k \rfloor}$) for which

$$\|\mathbf{P}^{(n)}[(\alpha_{\lfloor n/k \rfloor})^{\times k}] - \mathbf{S}[\beta_{\lfloor (1-\delta) n \rfloor}]\| \le \epsilon.$$

$\alpha$ is called the input resource, $\beta$ is called the output resource, $\delta$ is the inefficiency (or sometimes
the fractional inefficiency) and $\epsilon$ (which bounds both the validity and the error) is called the accuracy
(or sometimes just the error).

At first glance it may seem that we are demanding rather little from asymptotic resource
inequalities: we allow the depth of the input resource to grow arbitrarily, while requiring only a depth-1
output. However, later in this section we will use tools like the sliding lemma to show that this
definition is nevertheless strong enough to allow the sort of protocol manipulations we would like.

Also, for resources that consist entirely of states and one-way channels, it is never necessary to use
protocols with depth $> 1$. Thus, we state here a "flattening" lemma that will later be useful in
proving converses; i.e. statements about when certain resource inequalities are impossible.

Lemma 1.17 (Flattening). Suppose $\alpha \ge \beta$ and $\alpha$ is a "one-way" resource, meaning that it consists
entirely of static resources ($\mathcal{A}_\rho$) and dynamic resources which leave nothing on Alice's side (e.g.
$\mathcal{N}^{A' \to BE}$). Then for any $\epsilon, \delta > 0$, for sufficiently large $n$ there is an $\epsilon$-valid protocol $\mathbf{P}^{(n)}$ on $\alpha_n$ such
that

$$\|\mathbf{P}^{(n)}[\alpha_n] - \mathbf{S}[\beta_{\lfloor (1-\delta) n \rfloor}]\| \le \epsilon.$$

Proof. To prove the lemma, it will suffice to convert a protocol on $(\alpha_{\lfloor n/k \rfloor})^{\times k}$ to a protocol on
$(\alpha_{\lfloor n/k \rfloor})^{\otimes k}$. Then we can use the fact that $\alpha_{\lfloor n(1+\delta) \rfloor} \ge (\alpha_{\lfloor n/k \rfloor})^{\otimes k}$ and the lemma follows from a
suitable redefinition of $n$ and $\delta$.

Since $\alpha$ is a one-way resource, any protocol that uses it can be assumed to be of the following
form: first Alice applies all of the appending maps, then she does all of her local operations, then
she applies all of the dynamic resources, and finally Bob does his decoding operations. The one-way
nature of the protocol means that Bob can wait until all of Alice's operations are finished before he
starts decoding. It also means that Alice can apply the dynamic resources last, since they have no
outputs on her side, so none of her other operations can depend on them. Finally, the appending
maps can be pushed to the beginning because they have no inputs. Thus $(\alpha_{\lfloor n/k \rfloor})^{\times k}$ can be simulated
using $(\alpha_{\lfloor n/k \rfloor})^{\otimes k}$, completing the proof. □

Definition 1.18 (i.i.d. resources). A resource $\alpha$ is called independent and identically distributed
(i.i.d.) if $\alpha_n = (\mathcal{N}^{\otimes n} : \omega^{\otimes n})$ for some state $\omega$ and operation $\mathcal{N}$. We use the shorthand notation
$\alpha = \langle \mathcal{N} : \omega \rangle$.

We shall use the following notation for unit asymptotic resources: 

• ebit $[qq] := \langle \Phi_2 \rangle$

• rbit $[cc] := \langle \overline{\Phi}_2 \rangle$

• qubit $[q \to q] := \langle \mathrm{id}_2 \rangle$

• cbit $[c \to c] := \langle \overline{\mathrm{id}}_2 \rangle$

• cobit $[[c \to c]] := \langle \Delta_2 \rangle$ (cobits will be explained in Chapter 3)

In this thesis, we tend to use symbols for asymptotic resource inequalities (e.g. "$\langle \mathcal{N} \rangle \ge C[c \to c]$")
and words for finite protocols (e.g. "$\mathcal{N}^{\otimes n}$ can be used to send $\ge n(C - \delta_n)$ cbits with error $\le \epsilon_n$").
However, there is no formal reason that they cannot be used interchangeably.

We can also define versions of the dynamic resources with respect to the standard "reference"
state $\tau_2^{A'} = I/2$; a qubit in the maximally mixed state. These are denoted as follows:

• $[q \to q : \tau] := \langle \mathrm{id}_2 : \tau_2 \rangle$

• $[c \to c : \tau] := \langle \overline{\mathrm{id}}_2 : \tau_2 \rangle$

• $[[c \to c : \tau]] := \langle \Delta_2 : \tau_2 \rangle$

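Definition 1.19 (Addition). The addition operation $+ : \mathcal{R} \times \mathcal{R} \to \mathcal{R}$ is defined for $\alpha = (\alpha_n)_n$,
$\alpha_n = ((\mathcal{N}_{n,1} : \omega_{n,1}), \ldots, (\mathcal{N}_{n,\ell} : \omega_{n,\ell}))$, and $\beta = (\beta_n)_n$, $\beta_n = ((\mathcal{N}'_{n,1} : \omega'_{n,1}), \ldots, (\mathcal{N}'_{n,\ell'} : \omega'_{n,\ell'}))$, as
$\alpha + \beta = (\gamma_n)_n$ with

$$\gamma_n = (\alpha_n, \beta_n) := ((\mathcal{N}_{n,1} : \omega_{n,1}), \ldots, (\mathcal{N}_{n,\ell} : \omega_{n,\ell}), (\mathcal{N}'_{n,1} : \omega'_{n,1}), \ldots, (\mathcal{N}'_{n,\ell'} : \omega'_{n,\ell'})).$$

Closure is trivially verified. It is also easy to see that the operation $+$ is associative and
commutative. Namely,

1. $\alpha + \beta = \beta + \alpha$

2. $(\alpha + \beta) + \gamma = \alpha + (\beta + \gamma)$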

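Definition 1.20 (Multiplication). The multiplication operation $\cdot : \mathcal{R} \times \mathbb{R}_+ \to \mathcal{R}$ is defined for any
positive real number $z$ and resource $\alpha = (\alpha_n)_n$ by $z\alpha = (\alpha_{\lfloor zn \rfloor})_n$.

Of course, we need to verify that $\mathcal{R}$ is indeed closed under multiplication. Define $\beta := z\alpha$, so that
$\beta_n = \alpha_{\lfloor zn \rfloor}$. We know from Eq. (1.11), for all sufficiently large $n$, that

$$\alpha_{\lfloor n(1+\delta) \rfloor} \ge (\alpha_{\lfloor n/k \rfloor})^{\otimes k} \ge \alpha_{\lfloor n(1-\delta) \rfloor}.$$

We need to prove

$$\alpha_{\lfloor z \lfloor n(1+\delta') \rfloor \rfloor} \ge (\alpha_{\lfloor z \lfloor n/k \rfloor \rfloor})^{\otimes k} \ge \alpha_{\lfloor z \lfloor n(1-\delta') \rfloor \rfloor},$$

which is true for the right $\delta'$.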

Definition 1.21 (Asymptotic decoupling). Consider a resource inequality of the form $\alpha + \gamma \ge \beta$,
or $\alpha \ge \beta + \gamma$, where $\gamma$ is a classical resource, and $\alpha$ and $\beta$ are quantum resources. In either case, if in the
definition above, for each sufficiently large $n$ we also have that $\gamma_n$ is $\epsilon$-(coherently) decoupled with
respect to $\mathbf{P}^{(n)}$, then we say that $\gamma$ is (coherently) decoupled in the resource inequality.

The central purpose of our resource formalism is contained in the following "composability"
theorem, which states that resource inequalities can be combined via concatenation and addition. In
other words, the source of a resource (like cbits) doesn't matter; whether they were obtained via a
quantum channel or a carrier pigeon, they can be used equally well in any protocol that takes cbits
as an input. A well-known example of composability in classical information theory is Shannon's
joint source-channel coding theorem, which states that a channel with capacity $\ge C$ can transmit any
source with entropy rate $\le C$; the coding theorem is proved trivially by composing noiseless source
coding and noisy channel coding.
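As a concrete instance of the joint source-channel example, the sketch below compares the entropy rate of a biased coin source with the capacity $1 - h(p)$ of a binary symmetric channel. The specific source and channel, and the function names, are illustrative assumptions; the comparison uses a strict inequality to stay away from the boundary case.

```python
from math import log2

def binary_entropy(p):
    """Binary entropy h(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def bsc_capacity(flip_prob):
    """Capacity of a binary symmetric channel, in cbits per channel use."""
    return 1.0 - binary_entropy(flip_prob)

def composable(source_bias, flip_prob):
    """True when the source entropy rate is strictly below the channel capacity."""
    return binary_entropy(source_bias) < bsc_capacity(flip_prob)

# A coin with Pr[1] = 0.1 (about 0.47 bits/symbol) fits through a channel with
# 5% bit flips (about 0.71 cbits/use); a fair coin (1 bit/symbol) does not.
biased_ok = composable(0.1, 0.05)
fair_ok = composable(0.5, 0.05)
```

In the language of the theorem that follows, source coding turns the source into roughly $h$ cbits per symbol and channel coding produces roughly $C$ cbits per use, so the two statements chain together whenever $h < C$.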

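Theorem 1.22 (Composability). For resources in $\mathcal{R}$:

1. if $\alpha \ge \beta$ and $\beta \ge \gamma$ then $\alpha \ge \gamma$

2. if $\alpha \ge \beta$ and $\gamma \ge \varepsilon$ then $\alpha + \gamma \ge \beta + \varepsilon$

3. if $\alpha \ge \beta$ then $z\alpha \ge z\beta$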

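Proof. 1. Fix $\delta > 0$. Then there exist $k, k'$, such that for any $\epsilon$ and sufficiently large $n$

$$\|\mathbf{P}_1[(\alpha_{\lfloor n(1-\delta)/(mkk') \rfloor})^{\times k}] - \mathbf{S}[\beta_{\lfloor n(1-2\delta)/(mk') \rfloor}]\| \le \epsilon, \tag{1.12}$$
$$\|\mathbf{P}_2[(\beta_{\lfloor n(1-2\delta)/(mk') \rfloor})^{\times k'}] - \mathbf{S}[\gamma_{\lfloor n(1-3\delta)/m \rfloor}]\| \le \epsilon, \tag{1.13}$$
$$(\gamma_{\lfloor n(1-3\delta)/m \rfloor})^{\otimes m} \ge \gamma_{\lfloor n(1-4\delta) \rfloor}, \tag{1.14}$$

with $m \ge k'\ell/\delta$, where $\ell$ is the depth of $\beta$, and where $\mathbf{P}_1$ and $\mathbf{P}_2$ are both $\epsilon$-valid protocols. Equation
(1.14) implies the existence of a reduction protocol

$$\mathbf{R} : \mathbf{S}[\gamma_{\lfloor n(1-3\delta)/m \rfloor}]^{\otimes m} \ge \mathbf{S}[\gamma_{\lfloor n(1-4\delta) \rfloor}].$$

By Eq. (1.13),

$$\|\mathbf{P}_2[(\beta_{\lfloor n(1-2\delta)/(mk') \rfloor})^{\times k'}]^{\otimes m} - \mathbf{S}[\gamma_{\lfloor n(1-3\delta)/m \rfloor}]^{\otimes m}\| \le m\epsilon. \tag{1.15}$$

Define $\iota = \mathbf{P}_1[(\alpha_{\lfloor n(1-\delta)/(mkk') \rfloor})^{\times k}]^{\otimes k'}$, which, by Eq. (1.12), satisfies

$$\|\iota - \mathbf{S}[(\beta_{\lfloor n(1-2\delta)/(mk') \rfloor})^{\times k'}]\| \le k'\epsilon. \tag{1.16}$$

Let $\epsilon' = (m + k'\ell - 1)(k'\epsilon + 2\sqrt{\epsilon}) + \epsilon$. We shall exhibit an $\epsilon'$-valid protocol $\mathbf{P}_3$ such that

$$\|\mathbf{P}_3[\alpha_n] - \mathbf{S}[\gamma_{\lfloor n(1-4\delta) \rfloor}]\| \le \epsilon' + m\epsilon. \tag{1.17}$$

By Eq. (1.11), there is a reduction $\mathbf{R}'$ from the initial finite resource $\alpha_n$ to
$(\alpha_{\lfloor n(1-\delta)/(mkk') \rfloor})^{\times \lfloor mkk'(1+\delta) \rfloor}$, which in turn suffices to implement $\iota^{\times (m + k'\ell - 1)}$. By the Sliding
Lemma (1.10) and Eq. (1.16), there exists some $\epsilon'$-valid protocol $\mathbf{P}'$ such that

$$\|\mathbf{P}'[\iota^{\times (m + k'\ell - 1)}] - \mathbf{P}_2[(\beta_{\lfloor n(1-2\delta)/(mk') \rfloor})^{\times k'}]^{\otimes m}\| \le \epsilon'.$$

Now we claim that the protocol $\mathbf{P}_3 := \mathbf{R} \circ \mathbf{P}' \circ \mathbf{P}_1^{\otimes k'} \circ \mathbf{R}'$ satisfies Eq. (1.17). Indeed $\mathbf{P}_3[\alpha_n] =
\mathbf{R} \circ \mathbf{P}'[\iota^{\times (m + k'\ell - 1)}]$ maps $\alpha$ to $\gamma$ with inefficiency $\delta' \le 4\delta + 1/m \le 5\delta$, depth $\le mkk'(1+\delta) \le
k(k')^2 \ell (1 + 1/\delta)$ (where $k, k'$ depend only on $\delta$) and error $\epsilon'' \le \epsilon' + m\epsilon$. Since $\delta' \to 0$ as depth
increases and $\epsilon'' \to 0$ as $n \to \infty$, this satisfies our definition of an asymptotic protocol.

2. We begin with the standard quantifiers from our definition of a resource inequality: $\forall \delta >
0, \exists k, k', \forall \epsilon > 0, \exists N, \forall n \ge N$

$$\|\mathbf{P}_1[(\alpha_{\lfloor n/(kk') \rfloor})^{\times k}] - \mathbf{S}[\beta_{\lfloor n(1-\delta)/k' \rfloor}]\| \le \epsilon, \tag{1.18}$$
$$\|\mathbf{P}_2[(\gamma_{\lfloor n/(kk') \rfloor})^{\times k'}] - \mathbf{S}[\varepsilon_{\lfloor n(1-\delta)/k \rfloor}]\| \le \epsilon, \tag{1.19}$$
$$\mathbf{R}_1 : (\beta_{\lfloor n(1-\delta)/k' \rfloor})^{\otimes k'} \ge \beta_{\lfloor n(1-2\delta) \rfloor}, \tag{1.20}$$
$$\mathbf{R}_2 : (\varepsilon_{\lfloor n(1-\delta)/k \rfloor})^{\otimes k} \ge \varepsilon_{\lfloor n(1-2\delta) \rfloor}, \tag{1.21}$$

where $\mathbf{P}_1$ and $\mathbf{P}_2$ are both $\epsilon$-valid protocols. Hence the depth-$(k + k')$, $(k + k')\epsilon$-valid protocol $\mathbf{P}_3$ given by

$$\mathbf{R}_1 \circ \mathbf{P}_1[(\alpha_{\lfloor n/(kk') \rfloor})^{\times k}]^{\otimes k'} \otimes \mathbf{R}_2 \circ \mathbf{P}_2[(\gamma_{\lfloor n/(kk') \rfloor})^{\times k'}]^{\otimes k}$$

satisfies

$$\|\mathbf{P}_3[((\alpha + \gamma)_{\lfloor n/(kk') \rfloor})^{\times kk'}] - \mathbf{S}[(\beta + \varepsilon)_{\lfloor n(1-2\delta) \rfloor}]\| \le (k + k')\epsilon. \tag{1.22}$$

3. The proof is trivial.

□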

It is worth noting that our definitions of resources and resource inequalities were carefully chosen
with the above theorem in mind; as a result the proof exposes most of the important features of our
definitions. (It is a useful exercise to try changing aspects of our definitions to see where the above
proof breaks down.) By contrast, the remainder of this section will establish a number of details
about the resource formalism that mostly depend only on Eqns. (1.10) and (1.11) and not so much
on the details of how we construct protocols and resource inequalities.

Definition 1.23 (Equivalent resources). Define an equivalence between resources $\alpha \equiv \beta$ iff $\alpha \ge \beta$
and $\beta \ge \alpha$.

Example 1.24. It is easy to see that $R[qq] \equiv (\Phi_{D'_n})_n$ with $D'_n = \lfloor 2^{nR} \rfloor$.

Lemma 1.25. For resources in $\mathcal{R}$:

1. $(zw)\alpha \equiv z(w\alpha)$

2. $z(\alpha + \beta) = z\alpha + z\beta$

3. $(z + w)\alpha \equiv z\alpha + w\alpha$

Proof. 1. The > is trivial, since [-zwtoJ > [^L^^JJ- The < follows from [zuroj < zwn < [_z[_wn\ \ + 
z + 1. 

2. Immediate from the definitions. 

3. Let k = \_zm\ and k! = [wm\, where to is a parameter we will choose later. 
For any 5 and sufficiently large n (depending on S and m), 

&lzn(l+2S)] > ( a [[zn(l+8)]/k])® k , 

Oi[wn(l+2S)i > ( a [[wn{l+S)]/k']) <g ' k 7 

0-Y(z+w)n{l+2S)\ > (cïll(z+w')n(l+5)\/(k+k>)\)® ( · k+k , 

( a Hzn(l-S)\/k\ )® k > a [zn(l-2S)}> 

(ctLL-Lun(l — <5)J /fc'J ) > a|um(l-25)J , 
\®(k+k') -í, „ 

( a ll{z+w)n{l-S)i/{k+k')i) 'd. ®-\_{z+w)n(\-2&)\ ■ 

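Observe:

$|zn - kn/m| \le n/m$
$|\lfloor zn \rfloor - \lfloor kn/m \rfloor| \le n/m + 1$
$|\lfloor zn \rfloor - k \lfloor n/m \rfloor| \le n/m + k + 2$
$|\lfloor \lfloor zn \rfloor / k \rfloor - \lfloor n/m \rfloor| \le n/(km) + 2 + 2/k$.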

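Thus, for sufficiently large $n$ and an appropriate choice of $m$,

$\lfloor \lfloor zn(1+\delta) \rfloor / k \rfloor \ge \lfloor n/m \rfloor \ge \lfloor \lfloor zn(1-\delta) \rfloor / k \rfloor$.

Analogously,

$\lfloor \lfloor wn(1+\delta) \rfloor / k' \rfloor \ge \lfloor n/m \rfloor \ge \lfloor \lfloor wn(1-\delta) \rfloor / k' \rfloor$

and

$\lfloor \lfloor (w+z)n(1+\delta) \rfloor / (k+k') \rfloor \ge \lfloor n/m \rfloor \ge \lfloor \lfloor (w+z)n(1-\delta) \rfloor / (k+k') \rfloor$.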
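The four floor-function bounds used in part 3 of this proof can be spot-checked numerically. The following is a minimal sketch, not part of the original argument: the function name `check_floor_bounds` and the sampled ranges are illustrative assumptions, and exact rational arithmetic is used so that rounding cannot mask a violation.

```python
from fractions import Fraction
from math import floor

def check_floor_bounds(z, m, n):
    """Check the four bounds relating zn, kn/m and floor(n/m), with k = floor(z*m)."""
    z = Fraction(z)
    k = floor(z * m)
    if k < 1:
        raise ValueError("need z*m >= 1 so that k >= 1")
    nm = Fraction(n, m)          # n/m as an exact rational
    zn = z * n
    return (
        abs(zn - k * nm) <= nm
        and abs(floor(zn) - floor(k * nm)) <= nm + 1
        and abs(floor(zn) - k * floor(nm)) <= nm + k + 2
        and abs(floor(zn) // k - floor(nm)) <= nm / k + 2 + Fraction(2, k)
    )

# Sample rational z = p/7 with m >= 7, so that k = floor(z*m) >= 1 throughout.
all_ok = all(
    check_floor_bounds(Fraction(p, 7), m, n)
    for p in range(1, 30)
    for m in range(7, 40)
    for n in range(1, 200, 13)
)
print(all_ok)
```

A passing check on sampled values is only a sanity test; the proof itself relies on the bounds holding for all sufficiently large $n$.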






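Let us start with the $\le$ direction.

$\alpha_{\lfloor zn(1+2\delta) \rfloor} \otimes \alpha_{\lfloor wn(1+2\delta) \rfloor} \ge (\alpha_{\lfloor \lfloor zn(1+\delta) \rfloor / k \rfloor})^{\otimes k} \otimes (\alpha_{\lfloor \lfloor wn(1+\delta) \rfloor / k' \rfloor})^{\otimes k'}$

$\ge (\alpha_{\lfloor n/m \rfloor})^{\otimes k} \otimes (\alpha_{\lfloor n/m \rfloor})^{\otimes k'}$

$\ge (\alpha_{\lfloor \lfloor (z+w)n(1-\delta) \rfloor / (k+k') \rfloor})^{\otimes (k+k')}$

$\ge \alpha_{\lfloor (z+w)n(1-2\delta) \rfloor}$.

The $\ge$ direction is proven similarly.

□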



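Definition 1.26 (Equivalence classes of resources). Denote by $\tilde{\alpha}$ the equivalence class of $\alpha$, i.e.
the set of all $\alpha'$ such that $\alpha' \equiv \alpha$. Define $\tilde{\mathcal{R}}$ to be the set of equivalence classes of resources in $\mathcal{R}$.
Define the relation $\ge$ on $\tilde{\mathcal{R}}$ by $\tilde{\alpha} \ge \tilde{\beta}$ iff $\alpha' \ge \beta'$ for all $\alpha' \in \tilde{\alpha}$ and $\beta' \in \tilde{\beta}$. Define the operation $+$
on $\tilde{\mathcal{R}}$ such that $\tilde{\alpha} + \tilde{\beta}$ is the union of $\widetilde{\alpha' + \beta'}$ over all $\alpha' \in \tilde{\alpha}$ and $\beta' \in \tilde{\beta}$. Define the operation $\cdot$ on
$\tilde{\mathcal{R}}$ such that $z\tilde{\alpha}$ is the union of $\widetilde{z\alpha'}$ over all $\alpha' \in \tilde{\alpha}$.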

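Lemma 1.27. For resources in $\mathcal{R}$:

1. $\tilde{\alpha} \ge \tilde{\beta}$ iff $\alpha \ge \beta$

2. $\tilde{\alpha} + \tilde{\beta} = \widetilde{\alpha + \beta}$

3. $z\tilde{\alpha} = \widetilde{z\alpha}$

Proof. Regarding the first item: it suffices to show the "if" direction. Indeed, for any $\alpha' \in \tilde{\alpha}$ and
$\beta' \in \tilde{\beta}$,

$\alpha' \ge \alpha \ge \beta \ge \beta'$,

by Theorem 1.22. Regarding the second item: it suffices to show that if $\alpha' \equiv \alpha$, $\beta' \equiv \beta$ then
$\alpha' + \beta' \equiv \alpha + \beta$. This follows from Theorem 1.22. Similarly, for the third item it suffices to show
that if $\alpha' \equiv \alpha$ then $z\alpha' \equiv z\alpha$, which is true by Theorem 1.22. □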

We now state a number of additional properties of 1Z, each of which can be easily verified. 

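Theorem 1.28. The relation $\ge$ forms a partial order on the set $\tilde{\mathcal{R}}$:

1. $\tilde{\alpha} \ge \tilde{\alpha}$ (reflexivity)

2. if $\tilde{\alpha} \ge \tilde{\beta}$ and $\tilde{\beta} \ge \tilde{\gamma}$ then $\tilde{\alpha} \ge \tilde{\gamma}$ (transitivity)

3. if $\tilde{\alpha} \ge \tilde{\beta}$ and $\tilde{\beta} \ge \tilde{\alpha}$ then $\tilde{\alpha} = \tilde{\beta}$ (antisymmetry)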

□ 

Theorem 1.29. The following properties hold for the set $\tilde{\mathcal{R}}$ with respect to $+$ and multiplication by
positive real numbers.

1. $(zw)\tilde{\alpha} = z(w\tilde{\alpha})$

2. $(z + w)\tilde{\alpha} = z\tilde{\alpha} + w\tilde{\alpha}$

3. $z(\tilde{\alpha} + \tilde{\beta}) = z\tilde{\alpha} + z\tilde{\beta}$

4. $1\tilde{\alpha} = \tilde{\alpha}$

□

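Theorem 1.30. For equivalence classes in $\tilde{\mathcal{R}}$:

1. if $\tilde{\alpha}_1 \ge \tilde{\alpha}_2$ and $\tilde{\beta}_1 \ge \tilde{\beta}_2$ then $\tilde{\alpha}_1 + \tilde{\beta}_1 \ge \tilde{\alpha}_2 + \tilde{\beta}_2$

2. if $\tilde{\alpha} \ge \tilde{\beta}$ then $z\tilde{\alpha} \ge z\tilde{\beta}$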

□ 

Warning: Lemma 1.27 has essentially allowed us to replace resources with their equivalence
classes and $\equiv$ with $=$. Henceforth we shall equate the two, and drop the ~ superscript. The one
exception to this rule is when writing relative resources as $\langle \beta : \gamma \rangle$ where $\beta$ is a proper dynamic
resource and $\gamma$ is a proper static resource; in this case replacing $\langle \beta : \gamma \rangle$ with its equivalence class is
well-defined, but replacing $\beta$ and $\gamma$ with their equivalence classes wouldn't make sense.

1.3 General resource inequalities 

In this section, we describe several resource inequalities that will serve as useful basic tools for
manipulating and combining other resource inequalities.

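Lemma 1.31. Let $\beta$ and $\beta'$ be proper dynamic resources, and $\gamma$ and $\gamma'$ static test resources. The
following resource inequalities hold:

1. $\beta \ge \langle \beta : \gamma \rangle$

2. $\langle \beta : \gamma \rangle + \gamma \ge \beta(\gamma)$

3. if $\gamma \supseteq \gamma'$ then $\langle \beta : \gamma' \rangle \ge \langle \beta : \gamma \rangle$

4. $\langle \beta : \gamma \rangle + \langle \beta' : \beta(\gamma) \rangle \ge \langle (\beta' \circ \beta) : \gamma \rangle$.

Proof. Immediate from definitions. □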

Lemma 1.32 (Closure). For resources in $\mathcal{R}$, if $w_0 > 0$ and $w\alpha \ge \beta$ for every $w > w_0$ then $w_0 \alpha \ge \beta$.
Proof. The statement is equivalent to

$w_0 \alpha \ge (1 - \delta)\beta, \quad \forall \delta > 0$,

which by definition implies the statement for $\delta = 0$. □

The case of $w_0 = 0$ is special and corresponds to the use of a sublinear amount of a resource.
Definition 1.33 (Sublinear o terms). We write

$\alpha + o\gamma \ge \beta$

if for every $w > 0$

$\alpha + w\gamma \ge \beta$.

At the other extreme we might consider the case when we are allowed an unlimited amount of 
some resource, typically when proving converse theorems. 

Definition 1.34 ($\infty$ terms). We write

$\alpha + \infty\gamma \ge \beta$

if for any $\delta > 0$, there exists $k$ such that for any $\epsilon > 0$ there exist $n_1, n_2$ and an $\epsilon$-valid protocol $P$
satisfying

$\| P[(\alpha_{\lfloor n_1/k \rfloor} + \gamma_{\lfloor n_2/k \rfloor})^{\times k}] - \beta_{\lfloor (1-\delta)n \rfloor} \| \le \epsilon$.

This means that we can use an amount of $\gamma$ that increases arbitrarily quickly with $n$. Note that
$\infty\gamma$ cannot be defined as a resource, since it violates Eq. (1.11).

Definition 1.35 (Negative terms). For any $z < 0$, define the statement

$\alpha + z\gamma \ge \beta$

to mean that

$\alpha \ge \beta + (-z)\gamma$.

Similarly, $\alpha \ge \beta + z\gamma$ means that $\alpha + (-z)\gamma \ge \beta$.

Again $-\gamma$ is obviously not a resource, but the above definition lets us treat it as such.

We now return to sublinear terms. In general we cannot neglect sublinear resources; e.g. in
entanglement dilution, they are both necessary[HL04, HW03] and sufficient[LP99]. However, this
situation only occurs when they cannot be generated from the other resources being used in the
protocol.

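Lemma 1.36 (Removal of o terms). For $\alpha, \beta, \gamma \in \mathcal{R}$, if

$\alpha + o\gamma \ge \beta$
$z\alpha \ge \gamma$

for some real $z > 0$, then

$\alpha \ge \beta$.

Proof. For any $w > 0$

$(1 + zw)\alpha \ge \alpha + w\gamma \ge \beta$,

and the lemma follows by the Closure Lemma (1.32). □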

One place that sublinear resources often appear is as catalysts, meaning they are used to enable
a protocol without themselves being consumed. Repeating the protocol many times reduces the cost
of the catalyst to sublinear:

Lemma 1.37 (Cancellation). For $\alpha, \beta, \gamma \in \mathcal{R}$, if

$\alpha + \gamma \ge \beta + \gamma$,

then $\alpha + o\gamma \ge \beta$.

Proof. Combine $N$ copies of the inequality (using part 1 of Theorem 1.22) to obtain

$\gamma + N\alpha \ge \gamma + N\beta$.

Divide by $N$:

$N^{-1}\gamma + \alpha \ge N^{-1}\gamma + \beta \ge \beta$.

As $N^{-1}$ is arbitrarily small, the result follows. □
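The step "combine $N$ copies" can be spelled out as a telescoping chain. This is only a sketch of the bookkeeping: it assumes that the same resource may be added to both sides of a resource inequality and that inequalities may be chained, both of which are taken from the Composability Theorem (1.22).

```latex
\begin{align*}
\gamma + N\alpha
  &= (\alpha + \gamma) + (N-1)\alpha \\
  &\ge (\beta + \gamma) + (N-1)\alpha
   = \beta + \bigl(\gamma + (N-1)\alpha\bigr) \\
  &\ge 2\beta + \bigl(\gamma + (N-2)\alpha\bigr) \\
  &\ge \cdots \ge N\beta + \gamma .
\end{align*}
```

Each inequality applies $\alpha + \gamma \ge \beta + \gamma$ once, so the catalyst $\gamma$ is returned intact after every round and is paid for only once over the $N$ rounds.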

Often we will find it useful to use shared randomness as a catalyst. The condition for this to be
possible is that the randomness be incoherently decoupled:

Lemma 1.38 (Recycling common randomness). If $\alpha$ and $\beta$ are resources for which

$\alpha + R[cc] \ge \beta$,

and the $[cc]$ is incoherently decoupled in the above resource inequality (RI), then

$\alpha + o[cc] \ge \beta$.

Proof. Since $[cc]$ is asymptotically independent of the $\beta$ resource, by definitions 1.12 and 1.21 it
follows that

$\alpha + R[cc] \ge \beta + R[cc]$.

An application of the cancellation lemma (1.37) yields the desired result. □
Corollary 1.39. If $\alpha \ge [cc]$ and $\beta$ is pure then

$\alpha + R[cc] \ge \beta$

can always be derandomized to

$\alpha \ge \beta$.

Proof. It suffices to notice that for a pure output resource $\beta$, equation (1.9) is automatically satisfied.

□ 

The following theorem tells us that in proving channel coding theorems one only needs to consider
the case where the input state is maximally mixed. A similar result was shown in [BKN00] (see also
[KW04, YDH05]), though with quite different techniques and formalism.

Theorem 1.40 (Absolutization). The following resource inequalities hold: 

1. $[q \to q : \tau] \ge [q \to q]$

2. $[q \to qq : \tau] \ge [q \to qq]$

3. $[c \to c : \tau] \ge [c \to c]$

Proof. The theorem is a direct consequence of Lemma 1.11. We shall prove case 1., as the proofs of 2.
and 3. are identical. By Lemma 1.11, we know that

$[q \to q] : [\tau] + 2[cc] \ge [q \to q] + 2[cc]$.

By the cancellation lemma,

$[q \to q] : [\tau] + o[cc] \ge [q \to q]$.

Since

$[q \to q] : [\tau] \ge [cc]$,

by Lemma 1.36 the o term can be dropped, and we are done. □

Finally, we note how convex combinations of static resources can be thought of as states conditioned
on classical variables.

Theorem 1.41. Consider some static i.i.d. resource $\alpha = \langle \sigma \rangle$, where

$\sigma^{A X_A B X_B} = \sum_x p_x |x\rangle\langle x|^{X_A} \otimes |x\rangle\langle x|^{X_B} \otimes \rho_x^{AB}$.

Namely, Alice and Bob share an ensemble of bipartite states, and they both have the classical
information about which state they hold. Denote $\alpha_x = \langle \rho_x \rangle$. Then

$\alpha \ge \sum_x p_x \alpha_x$.

Proof. Recall the notion of the typical set[CT91, CK81] $T$ such that for any $\epsilon, \delta > 0$ and sufficiently
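large $n$, $p^{\otimes n}(T) \ge 1 - \epsilon$ and for any $x^n \in T$,

$|n_x - p_x n| \le \delta n$,

where $n_x$ is the number of occurrences of the symbol $x$ in $x^n$. Then

$\| \sigma^{\otimes n} - \sum_{x^n \in T} p^{\otimes n}(x^n) \, |x^n\rangle\langle x^n|^{X_A} \otimes |x^n\rangle\langle x^n|^{X_B} \otimes \rho_{x^n} \|_1 \le \epsilon$.

The state that we want to simulate is $S[\sum_x p_x \alpha_x] = (\omega_n)_n$ with

$\omega_n = \bigotimes_x \rho_x^{\otimes \lfloor p_x n \rfloor}$.

For any $x^n \in T$ there is, clearly, a unitary $U^A_{x^n} \otimes U^B_{x^n}$ that maps $\rho_{x^n}$ to $\omega_{\lfloor (1-\delta)n \rfloor - 1} \otimes \hat{\rho}_{x^n}$ exactly for
some state $\hat{\rho}_{x^n}$. Performing

$\left( \sum_{x^n} |x^n\rangle\langle x^n|^{X_A} \otimes U^A_{x^n} \right) \otimes \left( \sum_{x^n} |x^n\rangle\langle x^n|^{X_B} \otimes U^B_{x^n} \right)$

and tracing out subsystems thus brings $\sigma^{\otimes n}$ $\epsilon$-close to $\omega_{\lfloor (1-\delta)n \rfloor - 1}$. Hence the claim. □

In fact, the above result could be strengthened to the equality

$\alpha = \sum_x p_x \alpha_x + H(X_A)_\sigma [cc]$, (1.23)

but we will not need this fact, so leave the proof as an exercise for the reader. However, we will show
how a similar statement to Theorem 1.41 can be made about relative resources.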

Theorem 1.42. Consider some channel $\mathcal{N}$ with input Hilbert space $A$ and a state $\sigma$ of the form

$\sigma^{R A X_A X_B} = \sum_x p_x |x\rangle\langle x|^{X_A} \otimes |x\rangle\langle x|^{X_B} \otimes \phi_x^{RA}$.

Namely, Alice has an ensemble of states $|\phi_x\rangle$, and both parties have the classical information
identifying the state. Then

X 

□ 

Proof. We will only give an outline of the simulation procedure; the proof of correctness is essentially
the same as for the last theorem. Given $\sigma^{\otimes m}$ with $m = (1 - \delta)n - 1$, Alice will locally prepare $\hat{\rho}_{x^m}$
conditioned on $x^m$ from $\sigma^{\otimes m}$ (which is possible since $\phi_x$ can be locally prepared by Alice), perform
the inverse of the map $U_{x^m}$ from the last theorem and then apply $\mathcal{N}^{\otimes n}$. □



1.4 Known coding theorems expressed as resource inequalities

There have been a number of quantum and classical coding theorems discovered to date, typically
along with so-called converse theorems which prove that the coding theorems cannot be improved
upon. The theory of resource inequalities has been developed to provide an underlying unifying
principle. This direction was initially suggested in [DW03b].

We shall state theorems such as Schumacher compression, the classical reverse Shannon theorem,
the instrument compression theorem, the classical-quantum Slepian-Wolf theorem, the HSW theorem,
and CR concentration as resource inequalities. Then we will show how some of these can be used as
building blocks, yielding transparent and concise proofs of some derivative results.

We shall work within the QQ formalism. 

Schumacher compression. The quantum source compression theorem was proven by Schumacher
in [JS94, Sch95]. Given a quantum state $\rho^{A'}$, define $\sigma^B := \mathrm{id}^{A' \to B}(\rho^{A'})$. Then the following resource
inequality (RI) holds:

$(H(B)_\sigma + \delta)[q \to q] \ge \langle \mathrm{id}^{A' \to B} : \rho^{A'} \rangle$ (1.24)

if and only if $\delta \ge 0$.

Note that this formulation simultaneously expresses both the coding theorem and the converse 
theorem. 

Entanglement concentration. The problem of entanglement concentration was solved in
[BBPS96], and is, in a certain sense, a static counterpart to Schumacher's compression theorem.
Entanglement concentration can be thought of as a coding theorem which says that given a pure
bipartite quantum state $|\phi\rangle^{AB}$ the following RI holds:

$\langle \phi^{AB} \rangle \ge H(B)_\phi [qq]$. (1.25)

The reverse direction is known as entanglement dilution [BBPS96], and thanks to Lo and Popescu
[LP99] it is known that

$H(B)_\phi [qq] + o[c \to c] \ge \langle \phi^{AB} \rangle$. (1.26)

Were it not for the $o[c \to c]$ term, we would have the equality $\langle \phi^{AB} \rangle = H(B)_\phi [qq]$. However,
it turns out that the $o[c \to c]$ term cannot be avoided[HL04, HW03]. This means that the strongest
equality we can state has a sublinear amount of classical communication on both sides:

$H(B)_\phi [qq] + o[c \to c] = \langle \phi^{AB} \rangle + o[c \to c]$. (1.27)

Note how Eq. (1.27) states the converse in a form that is in some ways stronger than Eq. (1.24),
since it implies the transformation is not only optimal, but also asymptotically reversible. We can
also state a converse when more classical communication is allowed, though no longer as a resource
equality:

$\langle \phi^{AB} \rangle + \infty[c \to c] \ge (H(B)_\phi - \delta)[qq]$

iff $\delta \ge 0$; and similarly for entanglement dilution.
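The rate $H(B)_\phi$ in Eq. (1.25) is the Shannon entropy of the squared Schmidt coefficients of $|\phi\rangle^{AB}$. The sketch below computes it and the resulting first-order ebit yield; the function names are illustrative assumptions, and the $o(n)$ corrections discussed above are deliberately ignored.

```python
from math import log2

def entanglement_entropy(schmidt_probs):
    """H(B) in bits for a pure bipartite state with the given squared Schmidt coefficients."""
    total = sum(schmidt_probs)
    if abs(total - 1.0) > 1e-9 or any(p < 0 for p in schmidt_probs):
        raise ValueError("squared Schmidt coefficients must form a probability distribution")
    # Terms with p == 0 contribute nothing (0 * log 0 is taken to be 0).
    return sum(-p * log2(p) for p in schmidt_probs if p > 0)

def ebits_from_copies(schmidt_probs, n):
    """First-order yield n * H(B) of entanglement concentration on n copies."""
    return n * entanglement_entropy(schmidt_probs)

# A maximally entangled two-qubit state has squared Schmidt coefficients (1/2, 1/2).
print(entanglement_entropy([0.5, 0.5]))
# A partially entangled state yields strictly less than one ebit per copy.
print(entanglement_entropy([0.9, 0.1]))
```

For a product state the list is `[1.0]` and the rate is zero, consistent with the fact that no entanglement can be concentrated from it.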

Shannon compression. Shannon's classical compression theorem was proven in [Sha48]. Given a
classical state $\rho^{X_A}$ and defining

$\sigma^{X_B} := \overline{\mathrm{id}}^{X_A \to X_B}(\rho^{X_A})$,

Shannon's theorem says that

$(H(X_B)_\sigma + \delta)[c \to c] \ge \langle \overline{\mathrm{id}}^{X_A \to X_B} : \rho^{X_A} \rangle$, (1.28)

if and only if $\delta \ge 0$.

Common randomness concentration. This is the classical analogue of entanglement concentration,
and a static counterpart to Shannon's compression theorem. It states that, if Alice and Bob
have a copy of the same random variable $X$, embodied in the classical bipartite state

$\rho^{X_A X_B} = \sum_x p_x |x\rangle\langle x|^{X_A} \otimes |x\rangle\langle x|^{X_B}$,

then

$\langle \rho^{X_A X_B} \rangle \ge H(X_B)_\rho [cc]$. (1.29)

Incidentally, common randomness dilution can do without the o term:

$H(X_B)_\rho [cc] \ge \langle \rho^{X_A X_B} \rangle$.

Thus we obtain a simple resource equality:

$H(X_B)_\rho [cc] = \langle \rho^{X_A X_B} \rangle$.



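Classical reverse Shannon theorem (CRST). This theorem was proven in [BSST02, Win02],
and it generalizes Shannon's compression theorem to compress probability distributions of classical
states instead of pure classical states. Given a classical channel $\overline{\mathcal{N}} : X_{A'} \to Y_B$ and a classical state
$\rho^{X_{A'}}$, the CRST states that

$I(X_A; Y_B)_\sigma [c \to c] + H(X_A | Y_B)_\sigma [cc] \ge \langle \overline{\mathcal{N}} : \rho^{X_{A'}} \rangle$, (1.30)

where

$\sigma^{X_A Y_B} = \overline{\mathcal{N}} \circ \overline{\Delta}^{X_{A'} \to X_{A'} X_A}(\rho^{X_{A'}})$.

Moreover, given a modified classical channel $\overline{\mathcal{N}}' : X_{A'} \to Y_A Y_B$ which also provides Alice with a copy
of the channel output,

$\overline{\mathcal{N}}' = \overline{\Delta}^{Y_B \to Y_A Y_B} \circ \overline{\mathcal{N}}$,

the following stronger RI also holds:

$I(X_A; Y_B)_\sigma [c \to c] + H(X_A | Y_B)_\sigma [cc] \ge \langle \overline{\mathcal{N}}' : \rho^{X_{A'}} \rangle$. (1.31)

In fact, this latter RI can be reversed to obtain the equality

$I(X_A; Y_B)_\sigma [c \to c] + H(X_A | Y_B)_\sigma [cc] = \langle \overline{\mathcal{N}}' : \rho^{X_{A'}} \rangle$. (1.32)

However, in the case without feedback, the best we can do is a tradeoff curve between cbits and
rbits, with Eq. (1.30) representing the case of unlimited randomness consumption. The full tradeoff
will be given by an RI of the following form

$a[c \to c] + b[cc] \ge \langle \overline{\mathcal{N}} : \rho^{X_{A'}} \rangle$

where $(a, b)$ range over some convex set $CR(\overline{\mathcal{N}})$. It can be shown[Wyn75, BW05] that $(a, b) \in
CR(\overline{\mathcal{N}})$ iff there exist channels $\overline{\mathcal{N}}_1 : X_{A'} \to W_{C'}$, $\overline{\mathcal{N}}_2 : W_{C'} \to Y_B$ such that $\overline{\mathcal{N}} = \overline{\mathcal{N}}_2 \circ \overline{\mathcal{N}}_1$ and
$a \ge I(X_A; W_C)_\omega$, $b \ge I(X_A Y_B; W_C)_\omega$, where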



Classical compression with quantum side information. This problem was solved in [DW03a,
Win99b], and is a generalization of Shannon's classical compression theorem in which Bob has
quantum side information about the source. Suppose Alice and Bob are given an ensemble

$\rho^{X_A B} = \sum_x p_x |x\rangle\langle x|^{X_A} \otimes \rho_x^B$

and Alice wants to communicate $X_A$ to Bob, which would give them the state

$\sigma^{X_B B} := \overline{\mathrm{id}}^{X_A \to X_B}(\rho^{X_A B})$.

To formalize this situation, we use the Source as one of the protagonists in the protocol, so that the
coding theorem inputs a map from the Source to Alice and Bob $\langle \overline{\mathrm{id}}^{S_X \to X_A} \otimes \mathrm{id}^{S_B \to B} : \rho^{S_X S_B} \rangle$ and
outputs a map from the Source entirely to Bob. The coding theorem is then

$\langle \overline{\mathrm{id}}^{S_X \to X_A} \otimes \mathrm{id}^{S_B \to B} : \rho^{S_X S_B} \rangle + (H(X_B | B)_\sigma + \delta)[c \to c] \ge \langle \overline{\mathrm{id}}^{S_X \to X_B} \otimes \mathrm{id}^{S_B \to B} : \rho^{S_X S_B} \rangle$, (1.33)

which holds iff $\delta \ge 0$. This formulation ensures that we work with well-defined resources instead of
using the natural-seeming, but incorrect $\langle \overline{\mathrm{id}}^{X_A \to X_B} : \rho^{X_A B} \rangle$ (which violates Eqns. (1.10) and (1.11)).

Of course, with no extra resource cost Alice could keep a copy of $X_A$.



Instrument compression theorem. This theorem was proven in [Win04], and is a generalization
of the CRST. Given a remote instrument $\mathbf{T} : A' \to A X_B$, and a quantum state $\rho^{A'}$, the following RI
holds:

$I(R; X_B)_\sigma [c \to c] + H(X_B | R)_\sigma [cc] \ge \langle \mathbf{T} : \rho^{A'} \rangle$, (1.34)

where

$\sigma^{R A X_B} = \mathbf{T}(\psi^{RA'})$

and $|\psi\rangle\langle\psi|^{RA'} \supseteq \rho^{A'}$. Moreover, given a modified remote instrument which also provides Alice with
a copy of the instrument output,

$\mathbf{T}' = \overline{\Delta}^{X_B \to X_A X_B} \circ \mathbf{T}$,

the RI still holds:

$I(R; X_B)_\sigma [c \to c] + H(X_B | R)_\sigma [cc] \ge \langle \mathbf{T}' : \rho^{A'} \rangle$. (1.35)

Only this latter RI is known to be optimal (up to a trivial substitution of $[c \to c]$ for $[cc]$); indeed

$a[c \to c] + b[cc] \ge \langle \mathbf{T}' : \rho^{A'} \rangle$ (1.36)

iff $a \ge I(R; X_B)_\sigma$ and $a + b \ge H(X_B)_\sigma$.

By contrast, only the communication rate of Eq. (1.34) is known to be optimal; examples are
known in which less randomness is necessary.



Remote state preparation (RSP). Instrument compression can be thought of as a generalization
of the CRST from $\{c \to c\}$ channels to $\{q \to c\}$ channels. In contrast, remote state preparation
(proved in [BHL+05]) generalizes the CRST to $\{c \to q\}$ channels.

Let $\mathcal{E} = \sum_i p_i |i\rangle\langle i|^{X_A} \otimes |\psi_i\rangle\langle\psi_i|^{AB}$ be an ensemble of bipartite states. Define the corresponding
$\{c \to q\}$ channel $\mathcal{N}_\mathcal{E}$ by

$\mathcal{N}_\mathcal{E}(|i\rangle\langle j|^{X_A}) = \delta_{ij} |i\rangle\langle i|^{X_A} \otimes |\psi_i\rangle\langle\psi_i|^{AB}$. (1.37)

This means that $\mathcal{N}_\mathcal{E}$ measures the input in the standard basis and maps outcome $i$ to the joint state
$\psi_i^{AB}$. Thus, $\mathcal{E} = \mathcal{N}_\mathcal{E}(\mathcal{E}^{X_A})$, where $\mathcal{E}^{X_A}$ is the classical input state $\sum_i p_i |i\rangle\langle i|^{X_A}$.
The coding theorem of RSP states that

$I(X_A; B)_\mathcal{E} [c \to c] + H(B)_\mathcal{E} [qq] \ge \langle \mathcal{N}_\mathcal{E} : \mathcal{E}^{X_A} \rangle$, (1.38)

meaning that Alice can use the resources on the LHS to prepare a sequence of states $|\psi_{i_1}\rangle \cdots |\psi_{i_n}\rangle$
of her choosing, with high fidelity on average if she chooses $i^n$ according to $p^{\otimes n}$. Note that since
Alice holds the purification of Bob's state, this is stronger than the ability to simulate a $\{c \to q\}$
channel that gives Bob mixed states. The cbit cost is optimal in either case, since HSW coding
(Eq. (1.42), below) yields $\langle \mathcal{N}_\mathcal{E} : \mathcal{E}^{X_A} \rangle \ge I(X_A; B)_\mathcal{E} [c \to c]$ even if Alice's half of $\psi_i^{AB}$ is discarded.
However, the entanglement cost of Eq. (1.38) is only known to be optimal for the setting when Alice
holds the purification of Bob's output. Determining the minimal resources necessary to perform
visible mixed-state data compression has been a long-standing open problem in quantum information
theory[BCF+01, KI01, Win02].

Ref. [BHL+05] also proved a stronger "single-shot" version of RSP, the simplest form of which is
that $n(1 + o(1))$ cbits and $n$ ebits can be used to prepare an arbitrary $n$-qubit state. It is interesting
to note that this does not form an asymptotic resource (as given in Definition 1.15) because it fails
to satisfy Eq. (1.11).*



Teleportation and super-dense coding. Teleportation [BBC+93] and super-dense coding
[BW92] are finite protocols, and we have discussed them already in the introduction. In a somewhat
weaker form they may be written as resource inequalities. Teleportation (TP):

$2[c \to c] + [qq] \ge [q \to q]$. (1.39)

Super-dense coding (SD):

$[q \to q] + [qq] \ge 2[c \to c]$. (1.40)

Finally, entanglement distribution:

$[q \to q] \ge [qq]$. (1.41)

All of these protocols are optimal (we neglect the precise statements), but composing them with each
other (e.g. trying to reverse teleportation by using super-dense coding) is wasteful. We will give a
resolution to this problem in Chapter 3 by using coherent classical communication.
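The wastefulness of composing teleportation with super-dense coding can be seen by simple resource bookkeeping. The sketch below treats Eqs. (1.39)-(1.41) as finite (consumed, produced) pairs; the labels `cbit`, `ebit`, `qubit` and the helper `run_protocol` are illustrative assumptions, not notation from the text.

```python
from collections import Counter

# Each protocol is a pair (consumed, produced) of resource counts.
# Informal labels: "cbit" = [c->c], "ebit" = [qq], "qubit" = [q->q].
TELEPORTATION = (Counter(cbit=2, ebit=1), Counter(qubit=1))   # Eq. (1.39)
SUPERDENSE = (Counter(qubit=1, ebit=1), Counter(cbit=2))      # Eq. (1.40)
ENT_DISTRIBUTION = (Counter(qubit=1), Counter(ebit=1))        # Eq. (1.41)

def run_protocol(protocol, stock):
    """Consume and produce resources, failing if the stock is insufficient."""
    consumed, produced = protocol
    missing = consumed - stock   # Counter subtraction keeps only positive counts
    if missing:
        raise ValueError(f"insufficient resources: {dict(missing)}")
    return stock - consumed + produced

start = Counter(cbit=2, ebit=2)
after_tp = run_protocol(TELEPORTATION, start)   # one qubit channel use, one ebit left
after_sd = run_protocol(SUPERDENSE, after_tp)   # back to two cbits
print(dict(start), "->", dict(after_sd))
```

Starting from two cbits and two ebits, the round trip returns only the two cbits, so two ebits are lost even though each protocol is individually optimal.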

Holevo-Schumacher-Westmoreland (HSW) theorem. The direct part of this theorem was
proven in [Hol98, SW97] and the converse in [Hol73]. Together they say that given a quantum
channel $\mathcal{N} : A' \to B$, for any ensemble

$\rho^{X_A A'} = \sum_x p_x |x\rangle\langle x|^{X_A} \otimes \rho_x^{A'}$

the following RI holds:

$\langle \mathcal{N} : \rho^{A'} \rangle \ge (I(X_A; B)_\sigma - \delta)[c \to c]$, (1.42)

iff $\delta \ge 0$, where

$\sigma^{X_A B} = \mathcal{N}^{A' \to B}(\rho^{X_A A'})$.



Shannon's noisy channel coding theorem. This theorem was proven in [Sha48] and today can
be understood as a special case of the HSW theorem. One version of the theorem says that given a
classical channel $\overline{\mathcal{N}} : X_{A'} \to Y_B$ and any classical state $\rho^{X_{A'}}$ the following RI holds:

$\langle \overline{\mathcal{N}} : \rho^{X_{A'}} \rangle \ge (I(X_A; Y_B)_\sigma - \delta)[c \to c]$, (1.43)

iff $\delta \ge 0$ and where

$\sigma^{X_A Y_B} := \overline{\mathcal{N}} \circ \overline{\Delta}^{X_{A'} \to X_{A'} X_A}(\rho^{X_{A'}})$. (1.44)

If we optimize over all input states, then we find that

$\langle \overline{\mathcal{N}} \rangle \ge C[c \to c]$ (1.45)

iff there exists an input $\rho^{X_{A'}}$ such that $C \le I(X_A; Y_B)_\sigma$, with $\sigma$ given by Eq. (1.44).

*This has a number of interesting implications. For example, "single-shot" RSP is not amenable to the sort of
cbit-ebit tradeoffs that are possible in the ensemble case[DB01, HJW02, BHL+05]. In fact, the exp(n) cbit cost for
simulating single-shot RSP of n qubits is one of the few known examples where infinite, or super-linear, resources are
useful. Also, the RSP capacities of channels appear to be different for single-shot and ensemble RSP[Leu04].
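The rate $I(X_A; Y_B)_\sigma$ in Eq. (1.43) can be evaluated directly from an input distribution and a channel matrix. The following is a small sketch under the assumption that the channel is given as a row-stochastic table `channel[x][y]` $= p(y|x)$; the binary symmetric channel helper `bsc` is an illustrative example rather than anything defined in the text.

```python
from math import log2

def binary_entropy(p):
    """Binary entropy h(p) in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def mutual_information(input_dist, channel):
    """I(X;Y) in bits for input p(x) and channel[x][y] = p(y|x)."""
    ny = len(channel[0])
    py = [sum(input_dist[x] * channel[x][y] for x in range(len(input_dist)))
          for y in range(ny)]
    total = 0.0
    for x, px in enumerate(input_dist):
        for y in range(ny):
            pxy = px * channel[x][y]
            if pxy > 0:                       # zero-probability pairs contribute nothing
                total += pxy * log2(pxy / (px * py[y]))
    return total

def bsc(flip):
    """Binary symmetric channel with crossover probability flip."""
    return [[1 - flip, flip], [flip, 1 - flip]]

# For a uniform input the BSC rate equals 1 - h(flip).
rate = mutual_information([0.5, 0.5], bsc(0.1))
print(rate, 1 - binary_entropy(0.1))
```

Maximizing this quantity over the input distribution gives the largest $C$ allowed in Eq. (1.45); for the binary symmetric channel the uniform input already achieves the maximum.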

Entanglement-assisted capacity theorem. This theorem was proven in [BSST02, Hol02]. The
direct coding part of the theorem says that, given a quantum channel $\mathcal{N} : A' \to B$, for any quantum
state $\rho^{A'}$ the following RI holds:

$\langle \mathcal{N} : \rho^{A'} \rangle + H(R)_\sigma [qq] \ge I(R; B)_\sigma [c \to c]$, (1.46)

where

$\sigma^{RB} = \mathcal{N}(\psi^{RA'})$

for an arbitrary $\psi$ satisfying $|\psi\rangle\langle\psi|^{RA'} \supseteq \rho^{A'}$.

The only converse proven in [BSST02, Hol02] was for the case of infinite entanglement: they found
that $\langle \mathcal{N} \rangle + \infty[qq] \ge C[c \to c]$ iff $C \le I(R; B)_\sigma$ for some appropriate $\sigma$. [Sho04b] gave a full solution
to the tradeoff problem for entanglement-assisted classical communication, for which we will present an
alternate derivation in Section 4.2.7.

Quantum capacity (LSD) theorem. This theorem was conjectured in [Sch96, SN96], a heuristic
(but not universally accepted) proof given by Lloyd [Llo96] and finally proven by Shor [Sho02] and
with an independent method by Devetak [Dev05a]. The direct coding part of the theorem says that,
given a quantum channel $\mathcal{N} : A' \to B$, for any quantum state $\rho^{A'}$ the following RI holds:

$\langle \mathcal{N} : \rho^{A'} \rangle \ge (I(R \rangle B)_\sigma - \delta)[q \to q]$, (1.47)

iff $\delta \ge 0$ and where

$\sigma^{RB} = \mathcal{N}(\psi^{RA'})$

for any $\psi^{RA'}$ satisfying $|\psi\rangle\langle\psi|^{RA'} \supseteq \rho^{A'}$.

Noisy super-dense coding theorem. This theorem was proven in [HHH+01]. The direct coding
part of the theorem says that, given a bipartite quantum state $\rho^{AB}$, the following RI holds:

$\langle \rho^{AB} \rangle + H(A)_\rho [q \to q] \ge I(A; B)_\rho [c \to c]$. (1.48)

A converse was proven in [HHH+01] only for the case when an infinite amount of $\langle \rho^{AB} \rangle$ is supplied,
but we will return to this problem and provide a full trade-off curve in Section 4.2.2.

Entanglement distillation. The direct coding theorem for one-way entanglement distillation is
embodied in the hashing inequality, proved in [DW05a, DW04]: given a bipartite quantum state $\rho^{AB}$,

$\langle \rho^{AB} \rangle + I(A; E)_\psi [c \to c] \ge I(A \rangle B)_\psi [qq]$, (1.49)

where $|\psi\rangle\langle\psi|^{ABE} \supseteq \rho^{AB}$.

Again, the converse was previously only known for the case when an unlimited amount of classical
communication was available[Sch96, SN96, DW05a, DW04]. In Section 4.2.5 we will give an expression
for the full trade-off curve.

Noisy teleportation. This RI was discovered in [DHW04]. Given a bipartite quantum state $\rho^{AB}$,

$\langle \rho^{AB} \rangle + I(A; B)_\rho [c \to c] \ge I(A \rangle B)_\rho [q \to q]$.

Indeed, letting $|\psi\rangle\langle\psi|^{ABE} \supseteq \rho^{AB}$,

$\langle \rho^{AB} \rangle + I(A; B)_\psi [c \to c] = \langle \rho^{AB} \rangle + I(A; E)_\psi [c \to c] + 2 I(A \rangle B)_\psi [c \to c]$

$\ge I(A \rangle B)_\psi [qq] + 2 I(A \rangle B)_\psi [c \to c]$

$\ge I(A \rangle B)_\psi [q \to q]$.

The first inequality follows from Eq. (1.49) and the second from teleportation.
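The equality in the first line of this chain is an entropy identity for the pure state $\psi^{ABE}$. A short derivation, using only the standard definitions $I(A;B) = H(A) + H(B) - H(AB)$ and $I(A \rangle B) = H(B) - H(AB)$ together with the purity relations $H(E) = H(AB)$ and $H(AE) = H(B)$, is sketched here:

```latex
\begin{align*}
I(A;E)_\psi &= H(A) + H(E) - H(AE) = H(A) + H(AB) - H(B), \\
2\, I(A \rangle B)_\psi &= 2H(B) - 2H(AB), \\
I(A;E)_\psi + 2\, I(A \rangle B)_\psi
  &= H(A) + H(B) - H(AB) = I(A;B)_\psi .
\end{align*}
```

So the classical communication budget $I(A;B)$ splits exactly into the $I(A;E)$ cbits needed for distillation and the $2 I(A \rangle B)$ cbits consumed by teleporting through the distilled ebits.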

Classical-quantum communication trade-off for remote state preparation. The main coding
theorem of [HJW02] has two interpretations. Viewed as a statement about quantum compression
with classical side information, it says that, given an ensemble

$\rho^{X_{A'} A'} = \sum_x p_x |x\rangle\langle x|^{X_{A'}} \otimes \rho_x^{A'}$,

for any classical channel $\overline{\mathcal{N}} : X_{A'} \to Y_B$, the following RI holds:

$H(B | Y_B)_\sigma [q \to q] + I(X_A; Y_B)_\sigma [c \to c] \ge \langle \mathrm{id}^{A' \to B} : \rho^{X_{A'} A'} \rangle$, (1.50)

where

$\sigma^{X_A Y_B B} = ((\overline{\mathcal{N}}^{X_{A'} \to Y_B} \circ \overline{\Delta}^{X_{A'} \to X_{A'} X_A}) \otimes \mathrm{id}^{A' \to B}) \rho^{X_{A'} A'}$.

Conversely, if $a[q \to q] + b[c \to c] \ge \langle \mathrm{id}^{A' \to B} : \rho^{X_{A'} A'} \rangle$ then there exists a classical channel $\overline{\mathcal{N}} : X_{A'} \to
Y_B$ with corresponding state $\sigma$ such that $a \ge H(B | Y_B)_\sigma$ and $b \ge I(X_A; Y_B)_\sigma$.

We shall now show how the proof from [HJW02] may be written very succinctly in terms of previous
results. Define $\overline{\mathcal{N}}' = \overline{\Delta}^{Y_B \to Y_A Y_B} \circ \overline{\mathcal{N}}$. By the Classical Reverse Shannon Theorem (Eq. (1.31)) and
part 3 of Lemma 1.31,

$I(X_A; Y_B)_\sigma [c \to c] + H(X_A | Y_B)_\sigma [cc] \ge \langle \overline{\mathcal{N}}' : \rho^{X_{A'} A'} \rangle$.

On the other hand, Schumacher compression (Eq. (1.24)) and Theorem 1.42 imply

$H(B | Y_B)_\sigma [q \to q] \ge \langle \mathrm{id}^{A' \to B} : \overline{\mathcal{N}}'(\rho^{X_{A'} A'}) \rangle$.

Adding the two equations and invoking part 2 of Lemma 1.31 gives

$H(B | Y_B)_\sigma [q \to q] + I(X_A; Y_B)_\sigma [c \to c] + H(X_A | Y_B)_\sigma [cc] \ge \langle \mathrm{id}^{A' \to B} : \rho^{X_{A'} A'} \rangle$.

Finally, derandomizing via Corollary 1.39 gives the desired result (Eq. (1.50)).

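The result of [HJW02] may also be viewed as a statement about remote state preparation. Suppose
we are given a classical state $\rho^{X_{A''}}$ and a $\{c \to q\}$ map $\mathcal{N}'_\mathcal{E} : X_{A''} \to B$, $\mathcal{N}'_\mathcal{E} = \mathrm{id}^{A' \to B} \circ \mathcal{N}_\mathcal{E}$, where
$\mathcal{N}_\mathcal{E}$ has Kraus representation $\{ |\phi_x\rangle^{A'} \langle x|^{X_{A''}} \}_x$. Then for any classical channel $\overline{\mathcal{N}} : X_{A'} \to Y_B$, the
following RI holds:

$H(B | Y_B)_\sigma [q \to q] + I(X_A; Y_B)_\sigma [c \to c] \ge \langle \mathcal{N}'_\mathcal{E} : \rho^{X_{A''}} \rangle$, (1.51)

where $\sigma^{X_A Y_B B}$ is defined as above and

$\rho^{X_{A'} A'} = (\mathcal{N}_\mathcal{E} \circ \overline{\Delta}^{X_{A''} \to X_{A''} X_{A'}}) \rho^{X_{A''}}$.

This follows from adding (Eq. (1.50)) to

$\langle \mathrm{id}^{A' \to B} : \rho^{X_{A'} A'} \rangle \ge \langle \mathrm{id}^{A' \to B} : \rho^{X_{A'} A'} \rangle + \langle (\mathcal{N}_\mathcal{E} \circ \overline{\Delta}^{X_{A''} \to X_{A''} X_{A'}}) : \rho^{X_{A''}} \rangle$

$\ge \langle \mathcal{N}'_\mathcal{E} : \rho^{X_{A''}} \rangle$.

The first inequality follows from part 3 of Lemma 1.31 and the locality of the map $\mathcal{N}_\mathcal{E}$. The second
is an application of part 4 of Lemma 1.31.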

Common randomness distillation. This theorem was originally proven in [DW03b]. Given an
ensemble

$\rho^{X_A B} = \sum_x p_x |x\rangle\langle x|^{X_A} \otimes \rho_x^B$,

the following RI holds:

$\langle \rho^{X_A B} \rangle + H(X_A | B)_\rho [c \to c] \ge H(X_A)_\rho [cc]$. (1.52)

Armed with our theory of resource inequalities, the proof becomes extremely simple.

$\langle \rho^{X_A B} \rangle + H(X_A | B)_\rho [c \to c] \ge \langle \rho^{X_A B} \rangle + \langle \overline{\Delta}^{X_A \to X_A X_B} : \rho^{X_A B} \rangle$

$\ge \langle \rho^{X_A X_B B} \rangle$

$\ge \langle \rho^{X_A X_B} \rangle$

$\ge H(X_A)_\rho [cc]$.

The first inequality is by classical compression with quantum side information (Eq. (1.33)), the second
by Lemma 1.31, part 2, and the fourth by common randomness concentration (Eq. (1.29)).

1.5 Discussion 

This chapter has laid the foundations of a formal approach to quantum Shannon theory in which
the basic elements are asymptotic resources and protocols mapping between them. Before presenting
applications of this approach in the next three chapters, we pause for a moment to discuss the
limitations of our formalism and possible ways it may be extended.

The primary limitation is that our approach is most successful when considering one-way
communication and when dealing with only one noisy resource at a time. These, and other limitations,
suggest a number of ways in which we might imagine revising the notion of an asymptotic resource
we have given in Definition 1.15. For example, if we were to explore unitary and/or bidirectional
resources more carefully, then we would need to reexamine our treatments of depth and of relative
resources. Recall that in Definition 1.16 we (1) always simulate the depth-1 version of the output
resource, and (2) are allowed to use a depth-$k$ version of the input resource where $k$ depends only on the
target inefficiency and not the target error. These features were chosen rather delicately in order to
guarantee the convergence of the error and inefficiency in the Composability Theorem (1.22), which
in turn gets most of its depth blow-up from the double-blocking of the Sliding Lemma (1.10). However,
it is possible that a different model of resources would allow protocols which deal with depth
differently. This won't make a difference for one-way resources due to the Flattening Lemma (1.17),
but there is evidence that depth is an important resource in bidirectional communication[KNTSZ01];
on the other hand, it is unknown how quickly depth needs to scale with $n$.

Relative resources are another challenge for studying bidirectional communication. As we
discussed in Section 1.2.2, if $\rho^{AB}$ cannot be locally duplicated then $\langle \mathcal{N} : \rho^{AB} \rangle$ fails to satisfy Eq. (1.10)
and therefore is not a valid resource. The problem is that being able to simulate $n$ uses of a channel on
$n$ copies of a correlated or entangled state is not necessarily stronger than the ability to simulate
$n - 1$ uses of the channel on $n - 1$ copies of the state. The fact that many bidirectional problems
in classical information theory[Sha61] remain unsolved is an indication that the quantum versions of
these problems will be difficult. On the other hand, it is possible that special cases, such as unitary
gates or Hamiltonians, will offer simplifications not possible in the classical case.

Another challenge to our definition of a resource comes from unconventional "pseudo-resources"
that resemble resources in many ways but fail to satisfy the quasi-i.i.d. requirement (Eq. (1.11)). For
example, the ability to remotely prepare an arbitrary $n$-qubit state (in contrast with the ensemble
version in Eq. (1.38)) cannot be simulated by the ability to remotely prepare $k$ states of $n(1 + \delta)/k$
qubits each. There are many fascinating open questions surrounding this "single-shot" version of RSP;
for example, is the RSP capacity of a channel ever greater than its quantum capacity?* Another
example comes from the "embezzling states" of [vDH03]. The $n$-qubit embezzling state can be
prepared from $n$ cbits and $n$ ebits (which are also necessary[HW03]) and can be used as a resource
for entanglement dilution and for simulating noisy quantum channels on non-i.i.d. inputs[BDH+05];
however, it also cannot be prepared from $k$ copies of the $n(1 + \delta)/k$-qubit embezzling state. These
pseudo-resources are definitely useful and interesting, but it is unclear how they should fit into our
resource formalism.

Other extensions of the theory will probably require less modification. For example, it will not
a priori be hard to extend the theory to multi-user scenarios. Resources and capacities can even
be defined in non-cooperative situations pervasive in cryptography (see e.g. [WNI03]), which will
mostly require a more careful enumeration of different cases. We can also consider privacy to be a
resource. Our definitions of decoupled classical communication are a step in this direction; also there
are expressions for the private capacity of quantum channels [Dev05a] and states [DW05a], and there
are cryptographic versions of our Composability Theorem[BOM04, Unr04].

*Thanks to Debbie Leung for suggesting this question.



Chapter 2 

Communication using unitary 
interactions 

In this chapter, we approach bipartite unitary interactions through the lens of quantum Shannon
theory, by viewing them as two-way quantum channels. For example, we might try to find the
classical communication capacity of a $\mathrm{CNOT} = |0\rangle\langle 0| \otimes I + |1\rangle\langle 1| \otimes \sigma_x$ with control qubit in Alice's
laboratory and target qubit in Bob's laboratory. More generally, we will fix a bipartite gate $U \in
\mathcal{U}_{d \times d} = \mathcal{U}_{d^2}$ and investigate the rate at which $U$ can generate entanglement, send classical or quantum
messages and so on.
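As a concrete single-use illustration (the capacities studied in this chapter are asymptotic rates, which this does not compute), the CNOT gate can be assembled from the projector form above and applied to $|+\rangle|0\rangle$. The helper names `kron`, `add` and `matvec` are illustrative assumptions, written with plain Python lists to avoid any external dependency.

```python
from math import sqrt

def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def add(a, b):
    """Entrywise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def matvec(m, v):
    """Matrix-vector product."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]

I2 = [[1, 0], [0, 1]]
SIGMA_X = [[0, 1], [1, 0]]
P0 = [[1, 0], [0, 0]]   # |0><0| on Alice's control qubit
P1 = [[0, 0], [0, 1]]   # |1><1| on Alice's control qubit

# CNOT = |0><0| (x) I + |1><1| (x) sigma_x, in the basis |00>, |01>, |10>, |11>.
CNOT = add(kron(P0, I2), kron(P1, SIGMA_X))

# Alice prepares |+>, Bob prepares |0>; amplitudes on |00> and |10>.
plus_zero = [1 / sqrt(2), 0, 1 / sqrt(2), 0]
bell = matvec(CNOT, plus_zero)
print(bell)   # amplitudes on |00> and |11> only
```

The output has equal amplitudes on $|00\rangle$ and $|11\rangle$, i.e. one use of the gate on a product input creates a maximally entangled pair of qubits.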

This work can be applied both to computation (in a model where local operations are easy and
interactions are expensive) and to the rest of Shannon theory, which will be our primary focus in 
the next two chapters. Most other work on bipartite unitary gates has been more concerned with 
computational issues, but in Section 2.1 we survey the literature with an eye toward information 
theory applications. The main results of this chapter are the capacities of a bipartite unitary gate to 
create entanglement (in Section 2.2) and to send classical messages in one direction when assisted by 
an unlimited amount of entanglement (in Section 2.3). Along the way, we also establish some easily 
computable bounds on and relations between these capacities (in Sections 2.2 and 2.3) and discuss 
these capacities for some interesting specific gates in Section 2.4. We conclude with a summary and 
discussion in Section 2.5. 

Bibliographical note: Except where other works are cited, most of the results in this chapter are 
from [BHLS03] (joint work with Charles Bennett, Debbie Leung and John Smolin). However, this 
thesis reformulates them in the formalism of Chapter 1, which allows many of the definitions, claims 
and proofs to be greatly simplified. 

2.1 Background 

2.1.1 Survey of related work 

The nonlocal strength of unitary interactions was first discussed within a model of communication
complexity, when Nielsen introduced the Schmidt decomposition of a unitary gate (described below)
as a measure of its nonlocality[Nie98]. The idea of studying a gate in terms of nonlocal invariants
(parameters which are unchanged by local unitary rotations) was first applied to two-qubit gates by
[Mak02], which found that the nonlocal properties of these gates are completely described by three
real parameters. Later these invariants would be interpreted by [KBG01, KC01] as components of a
useful general decomposition of two-qubit gates: for any $U \in \mathcal{U}_{2 \times 2}$, there exist $A_1, A_2, B_1, B_2 \in \mathcal{U}_2$
and $\theta_x, \theta_y, \theta_z \in (-\pi/4, \pi/4]$ such that

$U = (A_1 \otimes B_1) e^{i(\theta_x \sigma_x \otimes \sigma_x + \theta_y \sigma_y \otimes \sigma_y + \theta_z \sigma_z \otimes \sigma_z)} (A_2 \otimes B_2)$. (2.1)

This fact has a number of useful consequences, but the only one we will use in this work is that
the nonlocal part $\exp(i \sum_j \theta_j \sigma_j \otimes \sigma_j)$ is symmetric under exchange of Alice and Bob, implying that
$\langle U \rangle = \langle \mathrm{SWAP}\, U\, \mathrm{SWAP} \rangle$ for any $U \in \mathcal{U}_{2 \times 2}$. This symmetry no longer holds[BCL+02] (and similar
decompositions generally do not exist) for $\mathcal{U}_{d \times d}$ with $d > 2$.

Other early work considered the ability of unitary gates to communicate and create entanglement.
[CLP01] showed that $\langle\mathrm{CNOT}\rangle + [qq] \ge [c\to c] + [c\leftarrow c]$ and [CGB00] proved that
$2\log d\,([c\to c] + [c\leftarrow c] + [qq]) \ge \langle U\rangle$ for any $U \in \mathcal{U}_{d\times d}$. The first discussion of asymptotic capacity was in [DVC+01],
which found the rate at which Hamiltonians can generate entanglement. Their technique would be
adopted mostly unchanged by [LHL03, BHLS03] to find the entanglement capacity of unitary gates.
In general it is difficult to exactly calculate the entanglement capability of Hamiltonians and gates,
but [CLVV03] finds the rate at which two-qubit Hamiltonians of the form $H = \alpha\,\sigma_x\otimes\sigma_x + \beta\,\sigma_y\otimes\sigma_y$
can generate entanglement.

Instead of reducing gates and Hamiltonians to standard resources, such as cbits and ebits, one
can consider the rates at which Hamiltonians and gates can simulate one another. The question of
when this is possible is related to the issue of computational universality, which we will not review
here; rather, we consider optimal simulations in which fast local operations are free. [BCL+02] found
the optimal rate at which a two-qubit Hamiltonian can simulate another, if the time evolution is
interspersed by fast local unitaries that do not involve ancilla systems. [VC02b] showed that adding
local ancilla systems improves this rate, but that classical communication does not. Hamiltonian
simulation is further improved when we allow the ancilla systems to contain entanglement that is
used catalytically [VC02a].

The question of optimally generating two-qubit unitary interactions using a given nonlocal Hamiltonian
was solved (without ancillas) in [VHC02] and the proof was greatly simplified in [HVC02]. A
more systematic approach to the problem was developed in [KBG01], which considers systems of
many qubits and applies its techniques to nuclear magnetic resonance. Recently, generic gates on $n$
qubits were shown by [Nie05] to require one- and two-qubit Hamiltonians to be applied for $\Omega(\exp(n))$
time. Hopefully this work will lead to useful upper bounds on the strengths of Hamiltonians, which
so far have been difficult to obtain.

Finally, one can also consider the reverse problem of simulating a nonlocal Hamiltonian or gate
using standard resources such as cbits and ebits. This problem has so far resisted optimal solutions,
except in a few special cases, such as Gottesman's [Got99] simulation of the CNOT using
$[c\to c] + [c\leftarrow c] + [qq]$. For general $d\times d$ unitary gates, a simple application of teleportation yields
$2\log d\,([c\to c] + [c\leftarrow c] + [qq]) \ge \langle U\rangle$ [CGB00] (and see also Proposition 2.8). Unfortunately this technique cannot
be used to efficiently simulate evolution under a nonlocal Hamiltonian for time $t$, since allowing Alice
and Bob to intersperse fast local Hamiltonians requires breaking the simulated action of $H$ into $t/\epsilon$
serial uses of $e^{-iH\epsilon}$ for $\epsilon \to 0$. This ends up requiring classical communication on the order of t
in order to achieve constant error. However, we would like the cost of a simulation to be linear in
the time the Hamiltonian is applied, so that we can discuss simulation rates that are asymptotically
independent of the time the Hamiltonian is applied. If classical communication is given for free, then
[CDKL01] shows how to simulate a general Hamiltonian for time $t$ using $O(t)$ entanglement. This
result was improved by Kitaev [Kit04], who showed how to use $O(t)([q\to q] + [q\leftarrow q])$ to simulate
a Hamiltonian for time $t$. However, though these constructions are efficient, their rates are far from
optimal.

2.1.2 Schmidt decompositions of states and operators 

Here we review the familiar Schmidt decomposition of bipartite quantum states [Per93, NC00],
and explain the analogous, but less well-known, operator Schmidt decomposition for bipartite
operators [Nie98].

Proposition 2.1 (Schmidt decomposition). Any bipartite pure state $|\psi\rangle \in \mathcal{H}_A \otimes \mathcal{H}_B$ can be
written as $|\psi\rangle = \sum_{i=1}^m \sqrt{\lambda_i}\,|\alpha_i\rangle_A|\beta_i\rangle_B$, where $\lambda_i > 0$, $\sum_i \lambda_i = 1$ (i.e. $\lambda$ is a probability distribution
with full support) and $|\alpha_i\rangle \in \mathcal{H}_A$ and $|\beta_i\rangle \in \mathcal{H}_B$ are orthogonal sets of vectors (i.e. $\langle\alpha_i|\alpha_{i'}\rangle =
\langle\beta_i|\beta_{i'}\rangle = \delta_{ii'}$). Since these vectors are orthogonal, $m \le \min(\dim\mathcal{H}_A, \dim\mathcal{H}_B)$.

Furthermore, the Schmidt rank $\mathrm{Sch}(\psi) := m$ is unique, as are the Schmidt coefficients $\lambda_i$, up to
a choice of ordering. Therefore unless otherwise specified we will take the $\lambda_i$ to be nonincreasing.
Also, for any other decomposition $|\psi\rangle = \sum_{i=1}^l |\alpha'_i\rangle_A|\beta'_i\rangle_B$ (with $|\alpha'_i\rangle, |\beta'_i\rangle$ not necessarily orthogonal
or normalized), we must have $l \ge \mathrm{Sch}(\psi)$.

Our proof follows the approach of [NC00].

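Proof. The key element of the proof is the singular value decomposition (SVD). Choose orthonormal
bases $\{|j\rangle\}_{1\le j\le d_A}$ and $\{|k\rangle\}_{1\le k\le d_B}$ for $\mathcal{H}_A$ and $\mathcal{H}_B$ respectively, where $d_A = \dim\mathcal{H}_A$ and $d_B =
\dim\mathcal{H}_B$. Then $|\psi\rangle$ can be written as $|\psi\rangle = \sum_{j,k} a_{jk}|j\rangle_A|k\rangle_B$, where $a$ is a $d_A \times d_B$ matrix. The SVD
states that there exists a set of positive numbers $\sqrt{\lambda_1}, \ldots, \sqrt{\lambda_m}$ and isometries $u : \mathbb{C}^m \to \mathbb{C}^{d_A}$ and
$v^\dagger : \mathbb{C}^m \to \mathbb{C}^{d_B}$ such that $a = u \cdot \sqrt{\mathrm{diag}(\lambda)} \cdot v$. Let $|\alpha_i\rangle_A := \sum_j u_{ji}|j\rangle_A$ and $|\beta_i\rangle_B := \sum_k v_{ik}|k\rangle_B$. Since
$u$ and $v^\dagger$ are isometries, it follows that $\{|\alpha_i\rangle\}$ and $\{|\beta_i\rangle\}$ are orthonormal sets.

To prove the second set of claims, note that the Schmidt coefficients are just the squared singular values
of the matrix $a_{jk} = (\langle j|\otimes\langle k|)|\psi\rangle$; since singular values are unique, so are Schmidt coefficients. Finally
if $|\psi\rangle = \sum_{i=1}^l |\alpha'_i\rangle_A|\beta'_i\rangle_B$, then $a_{jk} = \sum_{i=1}^l \langle j|\alpha'_i\rangle\langle k|\beta'_i\rangle$ and $\mathrm{Sch}(\psi) = \operatorname{rank} a \le l$. □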
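
The SVD construction in the proof can be sketched numerically. The following is a minimal illustration, assuming NumPy; the function names `schmidt_decomposition` and `rebuild` and the example state are hypothetical choices made here, not part of the original text.

```python
import numpy as np

def schmidt_decomposition(psi, d_a, d_b, tol=1e-12):
    """Return (lam, alphas, betas) with psi = sum_i sqrt(lam[i]) * kron(alphas[i], betas[i])."""
    # a[j, k] = (<j| (x) <k|) |psi>, a d_a x d_b matrix as in the proof.
    a = np.asarray(psi, dtype=complex).reshape(d_a, d_b)
    u, s, vh = np.linalg.svd(a, full_matrices=False)
    keep = s > tol                  # discard numerically zero singular values
    lam = s[keep] ** 2              # Schmidt coefficients, in nonincreasing order
    alphas = u[:, keep].T           # alphas[i][j] = u_{ji}
    betas = vh[keep, :]             # betas[i][k] = v_{ik}
    return lam, alphas, betas

def rebuild(lam, alphas, betas):
    """Reassemble sum_i sqrt(lam_i) |alpha_i>|beta_i> as a flat vector."""
    return sum(np.sqrt(l) * np.kron(a, b) for l, a, b in zip(lam, alphas, betas))

# Illustrative input: the two-qubit state (|00> + |11>)/sqrt(2).
bell = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
lam, alphas, betas = schmidt_decomposition(bell, 2, 2)
print(lam)
print(np.allclose(rebuild(lam, alphas, betas), bell))
```

The number of returned coefficients is the Schmidt rank, which equals the rank of the matrix $a$ as used in the last step of the proof.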

Schmidt decomposition and entanglement manipulation: The Schmidt coefficients are central to
the study of bipartite pure state entanglement. For example, two states can be transformed into
one another via local unitary transformations if and only if they have the same Schmidt coefficients.
Thus, we usually choose entanglement measures on pure states to be functions only of their Schmidt
coefficients.

Moreover, the intuitive requirement that entanglement be nonincreasing under local operations
and classical communication (LOCC) is equivalent to the mathematical requirement that entanglement
measures be Schur-concave functions of a state's Schmidt coefficients. (A function $f : \mathbb{R}^n \to \mathbb{R}$
is Schur-concave iff $v \prec w \Rightarrow f(v) \ge f(w)$ [Bha97].) The proof is as follows: Suppose a bipartite
pure state $|\psi\rangle$ can be transformed by LOCC into $|\psi_i\rangle$ with probability $p_i$ (i.e. the state
$\sum_i p_i\,|ii\rangle\langle ii|_{A_1B_1} \otimes |\psi_i\rangle\langle\psi_i|_{A_2B_2}$). Then [Nie99a] showed that this transformation is possible if and
only if there exist $\lambda$ and $\mu_i$ such that $\lambda = \sum_i p_i\mu_i$, where $\lambda$ is the set of Schmidt coefficients of
$|\psi\rangle$ (ordered arbitrarily) and the $\mu_i$ are the Schmidt coefficients for $|\psi_i\rangle$ (again in an arbitrary
ordering). As a consequence, if $E(|\psi\rangle)$ is an entanglement measure that is a Schur-concave function
of the Schmidt coefficients of $|\psi\rangle$, then the expectation of $E$ is nonincreasing under LOCC;
i.e. $E(|\psi\rangle) \ge \sum_i p_i E(|\psi_i\rangle)$ [Nie99a, Nie99b]. This general principle unifies many results about
entanglement not increasing under LOCC. If we take $E$ to be the standard entropy of entanglement
$E(\psi) = H(\mathrm{Tr}_B\,\psi)$, then we find that its expectation doesn't increase under LOCC; similarly for the
min-entropy $E_\infty(\psi) = -\log\|\mathrm{Tr}_B\,\psi\|_\infty$. Since $\exp(\alpha E_0(\psi)) = \mathrm{Sch}(\psi)^\alpha$ is Schur-concave for all $\alpha > 0$,
we also find that Schmidt number has zero probability of increasing under any LOCC transformation
(of course, this result follows more directly from the relation $\lambda = \sum_i p_i\mu_i$).

Operator-Schmidt decomposition: A similar Schmidt decomposition exists for bipartite linear
operators $M \in \mathcal{L}(\mathcal{H}_A \otimes \mathcal{H}_B)$*. Define the Hilbert-Schmidt inner product on $\mathcal{L}(\mathcal{H})$ by $(X, Y) :=
\mathrm{Tr}\,X^\dagger Y / \dim\mathcal{H}$ for any $X, Y \in \mathcal{L}(\mathcal{H})$. For example, a complete orthonormal basis for the space of
one-qubit operators is the set of Pauli matrices, $\{I, X, Y, Z\}$.

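Let $d_A = \dim\mathcal{H}_A$ and $d_B = \dim\mathcal{H}_B$. Then any $M \in \mathcal{L}(\mathcal{H}_A \otimes \mathcal{H}_B)$ can be Schmidt decomposed
into

$$M = \sum_{i=1}^{\mathrm{Sch}(M)} \sqrt{\lambda_i}\, A_i \otimes B_i \qquad (2.2)$$

where $\mathrm{Sch}(M) \le \min(d_A^2, d_B^2)$, $\mathrm{Tr}\,A_i^\dagger A_j = d_A\delta_{ij}$ and $\mathrm{Tr}\,B_i^\dagger B_j = d_B\delta_{ij}$. Normalization means that
$\mathrm{Tr}\,M^\dagger M = d_A d_B \sum_i \lambda_i$; typically $M$ is unitary, so $\sum_i \lambda_i = \mathrm{Tr}\,M^\dagger M / d_A d_B = 1$.
A simple example is the CNOT gate which has operator-Schmidt decomposition

$$\mathrm{CNOT} = \frac{1}{\sqrt2}\,\sqrt2\,|0\rangle\langle 0| \otimes I + \frac{1}{\sqrt2}\,\sqrt2\,|1\rangle\langle 1| \otimes X \qquad (2.3)$$

and hence has Schmidt coefficients $\{1/\sqrt2, 1/\sqrt2\}$, and $\mathrm{Sch}(\mathrm{CNOT}) = 2$. The SWAP gate for qubits has
operator-Schmidt decomposition

$$\mathrm{SWAP} = \frac12\,(I \otimes I + X \otimes X + Y \otimes Y + Z \otimes Z) \qquad (2.4)$$

and hence $\mathrm{Sch}(\mathrm{SWAP}) = 4$.

*We use Nielsen's definition of operator Schmidt number from [Nie98]. In [TH00], Terhal and Horodecki defined an
alternative notion of Schmidt number for bipartite density matrices which we will not use.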
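
The two examples can be checked numerically. The sketch below assumes NumPy and uses a hypothetical helper, `operator_schmidt_coefficients`, that regroups the matrix entries of $M$ by Alice's and Bob's indices and takes an SVD. It returns the $\lambda_i$ of Eq. (2.2), so the prefactors $1/\sqrt2$ in Eq. (2.3) appear as $\sqrt{\lambda_i}$.

```python
import numpy as np

def operator_schmidt_coefficients(m, d_a, d_b, tol=1e-12):
    """Return the lambda_i of Eq. (2.2) for an operator m on C^d_a (x) C^d_b."""
    # View m as a tensor m[j, k, j', k'] and regroup it so that rows carry
    # Alice's indices (j, j') and columns carry Bob's indices (k, k').
    t = np.asarray(m, dtype=complex).reshape(d_a, d_b, d_a, d_b)
    r = t.transpose(0, 2, 1, 3).reshape(d_a * d_a, d_b * d_b)
    s = np.linalg.svd(r, compute_uv=False)
    # With Tr A_i^dag A_j = d_a delta_ij and Tr B_i^dag B_j = d_b delta_ij,
    # sqrt(lambda_i) = s_i / sqrt(d_a * d_b).
    lam = s ** 2 / (d_a * d_b)
    return lam[lam > tol]

# Alice holds the first tensor factor (the control qubit of CNOT).
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]])
SWAP = np.array([[1, 0, 0, 0],
                 [0, 0, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1]])

lam_cnot = operator_schmidt_coefficients(CNOT, 2, 2)
lam_swap = operator_schmidt_coefficients(SWAP, 2, 2)
print(len(lam_cnot), np.sqrt(lam_cnot))   # Schmidt number and sqrt(lambda_i) for CNOT
print(len(lam_swap), np.sqrt(lam_swap))   # Schmidt number and sqrt(lambda_i) for SWAP
```

For unitary inputs the returned values sum to one, matching the normalization stated after Eq. (2.2).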

Most facts about the Schmidt decomposition for bipartite states carry over to bipartite operators:
in particular, if $M = A_1 \otimes B_1 + \ldots + A_m \otimes B_m$, then $m \ge \mathrm{Sch}(M)$. This implies a useful lemma
(originally due to [Nie98], but further discussed in [NDD+03]):

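Lemma 2.2 (Submultiplicity of Schmidt number). Let $U$ and $V$ be bipartite operators and $|\psi\rangle$
a bipartite state. Then

1. $\mathrm{Sch}(UV) \le \mathrm{Sch}(U)\,\mathrm{Sch}(V)$

2. $\mathrm{Sch}(U|\psi\rangle) \le \mathrm{Sch}(U)\,\mathrm{Sch}(|\psi\rangle)$

Proof. If $U = \sum_i \sqrt{u_i}\,A_i \otimes B_i$ and $V = \sum_j \sqrt{v_j}\,C_j \otimes D_j$ are Schmidt decompositions, then $UV =
\sum_{i,j} \sqrt{u_i v_j}\,A_iC_j \otimes B_iD_j$ is a decomposition of $UV$ into $\mathrm{Sch}(U)\,\mathrm{Sch}(V)$ terms. Therefore $\mathrm{Sch}(UV) \le
\mathrm{Sch}(U)\,\mathrm{Sch}(V)$.

Claim 2 is similar. If $|\psi\rangle = \sum_j \sqrt{\lambda_j}\,|a_j\rangle \otimes |b_j\rangle$ then $U|\psi\rangle = \sum_{i,j} \sqrt{u_i\lambda_j}\,A_i|a_j\rangle \otimes B_i|b_j\rangle$ is a
decomposition with $\mathrm{Sch}(U)\,\mathrm{Sch}(|\psi\rangle)$ terms. Thus $\mathrm{Sch}(U|\psi\rangle) \le \mathrm{Sch}(U)\,\mathrm{Sch}(|\psi\rangle)$. □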

Part 2 of the above Lemma provides an upper bound for how quickly the Schmidt number of a
state can grow when acted on by a bipartite unitary gate. It turns out that this bound is saturated
when the gate acts on registers that are maximally entangled with local ancilla systems. This is proven
by the next Lemma, a simple application of the Jamiolkowski state/operator isomorphism [Jam72]
that was first pointed out by Barbara Terhal in an unpublished comment.

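Lemma 2.3. Given Hilbert spaces $A, A', B, B'$ with $d_A := \dim A = \dim A'$ and $d_B := \dim B =
\dim B'$, let $M \in \mathcal{L}(\mathcal{H}_A \otimes \mathcal{H}_B)$ have Schmidt decomposition $M = \sum_i \sqrt{\lambda_i}\,A_i \otimes B_i$. Then the state
$|\Phi(M)\rangle := (M_{AB} \otimes I_{A'B'})|\Phi_{d_A}\rangle_{AA'}|\Phi_{d_B}\rangle_{BB'}$ also has Schmidt coefficients $\{\lambda_i\}$.

Proof. If we define $|a_i\rangle = (A_i \otimes I)|\Phi_{d_A}\rangle$ and $|b_i\rangle = (B_i \otimes I)|\Phi_{d_B}\rangle$, then $|\Phi(M)\rangle$ can be written as

$$|\Phi(M)\rangle = \sum_i \sqrt{\lambda_i}\,|a_i\rangle|b_i\rangle. \qquad (2.5)$$

Note that $\langle a_i|a_j\rangle = \mathrm{Tr}\,(A_i^\dagger A_j \otimes I)\Phi_{d_A} = \mathrm{Tr}\,A_i^\dagger A_j / d_A = \delta_{ij}$ and similarly $\langle b_i|b_j\rangle = \delta_{ij}$. Thus Eq. (2.5)
is a Schmidt decomposition of $|\Phi(M)\rangle$, and since the Schmidt coefficients are unique, $|\Phi(M)\rangle$ has
Schmidt coefficients $\{\lambda_i\}$. □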
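
Lemma 2.3 can be checked numerically for two-qubit gates. The sketch below assumes NumPy and compares two independent calculations: the state Schmidt coefficients of $|\Phi(M)\rangle$ across the $AA'|BB'$ cut, and operator-Schmidt coefficients obtained from the Pauli expansion of $M$. All function names are hypothetical and the random gate is only a test input.

```python
import numpy as np

PAULIS = [np.eye(2),
          np.array([[0, 1], [1, 0]]),
          np.array([[0, -1j], [1j, 0]]),
          np.array([[1, 0], [0, -1]])]

def max_entangled(d):
    """|Phi_d> = d^{-1/2} sum_j |j>|j> as a vector of length d*d."""
    return np.eye(d).reshape(d * d) / np.sqrt(d)

def phi_state(m, d_a, d_b):
    """Amplitudes of (M_AB (x) I_A'B')|Phi>_AA'|Phi>_BB', indexed [a, a', b, b']."""
    psi = np.kron(max_entangled(d_a), max_entangled(d_b)).reshape(d_a, d_a, d_b, d_b)
    m4 = np.asarray(m, dtype=complex).reshape(d_a, d_b, d_a, d_b)   # m4[a, b, x, y]
    # Contract the input indices (x, y) of M with the A and B registers.
    return np.einsum('abxy,xpyq->apbq', m4, psi)

def state_schmidt(psi4, tol=1e-12):
    """Schmidt coefficients of a state indexed [a, a', b, b'] across AA' | BB'."""
    d_a, _, d_b, _ = psi4.shape
    s = np.linalg.svd(psi4.reshape(d_a * d_a, d_b * d_b), compute_uv=False)
    lam = s ** 2
    return lam[lam > tol]

def operator_schmidt_two_qubit(m, tol=1e-12):
    """Operator-Schmidt coefficients of a two-qubit operator via its Pauli expansion."""
    # c[i, j] is the coefficient of P_i (x) P_j; the Paulis satisfy Tr P^dag P = 2 = d.
    c = np.array([[np.trace(np.kron(p, q).conj().T @ m) / 4 for q in PAULIS]
                  for p in PAULIS])
    lam = np.linalg.svd(c, compute_uv=False) ** 2
    return lam[lam > tol]

CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)
rng = np.random.default_rng(1)
z = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
random_gate, _ = np.linalg.qr(z)          # an arbitrary two-qubit unitary

for gate in (CNOT, random_gate):
    print(np.allclose(state_schmidt(phi_state(gate, 2, 2)),
                      operator_schmidt_two_qubit(gate)))
```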



2.2 Entanglement capacity of unitary gates 

In this section, we investigate the entanglement generating capacity of a unitary interaction. Fix a
gate $U \in \mathcal{U}_{d\times d}$ (the generalization to $d_A \times d_B$ is straightforward) and let $\langle U\rangle$ denote the corresponding
asymptotic resource.* Then we define the entanglement capacity $E(U)$ to be the largest $E$ such that

$$\langle U\rangle \ge E[qq] \qquad (2.6)$$

For a bipartite pure state $|\psi\rangle^{AB}$, we will also use $E(|\psi\rangle)$ to indicate the entropy of entanglement of
$|\psi\rangle$; i.e. $H(A)_\psi = H(B)_\psi$ in the language of the last chapter.

We will start by stating some easily computable bounds on $E(U)$, then prove a general expression
for the capacity and conclude by discussing some consequences.

Simple bounds on entanglement capacity: We can establish some useful bounds on $E(U)$ merely
by knowing the Schmidt coefficients of $U$. The following proposition expresses these bounds.

Proposition 2.4. If $U$ has Schmidt decomposition $U = \sum_i \sqrt{\lambda_i}\,A_i \otimes B_i$, then

$$H(\lambda) = \sum_i -\lambda_i \log\lambda_i \le E(U) \le H_0(\lambda) = \log\mathrm{Sch}(U). \qquad (2.7)$$

Proof. The lower bound follows from Lemma 2.3 and entanglement concentration (recall from
Eq. (1.25) that if $\psi$ is a bipartite pure state then $\langle\psi\rangle \ge E(\psi)[qq]$). Thus, $\langle U\rangle \ge \langle\Phi(U)\rangle \ge H(\lambda)[qq]$,
where $\Phi(U)$ is defined as in Eq. (2.5) and we have used the fact that $E(\Phi(U)) = H(\lambda)$.

To prove the upper bound, we use Lemma 2.2 to show that $n$ uses of $U$ together with LOCC
can generate only mixtures of pure states with Schmidt number $\le \mathrm{Sch}(U)^n = \exp(nH_0(\lambda))$. Since
approximating $|\Phi\rangle^{\otimes nE(1-\delta)}$ to accuracy $\epsilon$ requires a mixture of pure states with expected Schmidt
number $\ge (1-\epsilon)\exp(nE(1-\delta))$, asymptotically we must have $H_0(\lambda) \ge E(U)$. □
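
As a worked instance of Eq. (2.7), the sketch below evaluates both bounds from a list of operator-Schmidt coefficients. It assumes logarithms base 2 (so the bounds are in ebits); the function name is hypothetical. The third input is the gate $e^{i\theta X\otimes X} = \cos\theta\, I\otimes I + i\sin\theta\, X\otimes X$ with $\cos^2\theta = 0.9$, chosen here only to show a case where the two bounds differ.

```python
import math

def entanglement_capacity_bounds(lam):
    """Return (H(lambda), log2 Sch(U)) for operator-Schmidt coefficients lam, in ebits."""
    lam = [x for x in lam if x > 0]
    if abs(sum(lam) - 1.0) > 1e-9:
        raise ValueError("Schmidt coefficients of a unitary must sum to 1")
    lower = -sum(x * math.log2(x) for x in lam)   # H(lambda)
    upper = math.log2(len(lam))                   # H_0(lambda) = log Sch(U)
    return lower, upper

cnot_bounds = entanglement_capacity_bounds([0.5, 0.5])     # coefficients of CNOT
swap_bounds = entanglement_capacity_bounds([0.25] * 4)     # coefficients of SWAP
skewed_bounds = entanglement_capacity_bounds([0.9, 0.1])   # exp(i*theta*X(x)X), cos^2(theta) = 0.9

print(cnot_bounds, swap_bounds, skewed_bounds)
```

When the coefficients are uniform, as for CNOT and SWAP, the lower and upper bounds coincide and Eq. (2.7) fixes $E(U)$ exactly.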

As a corollary, any nonlocal $U$ has a nonzero $E(U)$. A similar, though less quantitative, result
holds for communication as well.

Proposition 2.5. If $U$ is nonlocal then $\langle U\rangle \ge C[c\to c]$ for some $C > 0$.

We state this here since we will need it for the proof of the next theorem, but defer the proof of
Proposition 2.5 until Section 2.3 so as to focus on entanglement generation in this section.

General formula for entanglement capacity: The main result on the entanglement capacity is the 
following method of expressing it in terms of a single use of U: 

Theorem 2.6.

$$E(U) = \Delta E_U := \sup_{|\psi\rangle \in \mathcal{H}_{AA'BB'}} E((U_{AB} \otimes I_{A'B'})|\psi\rangle) - E(|\psi\rangle) \qquad (2.8)$$

where the supremum ranges over Hilbert spaces $A'$, $B'$ of any finite dimension.

In other words, the asymptotic entanglement capacity $E(U)$ is equal to the largest single-shot
increase of entanglement $\Delta E_U$, if we are allowed to start with an arbitrary pure (possibly entangled)
state. This result was independently obtained in [LHL03] and is based on a similar result for
Hamiltonians in [DVC+01]. Here we restate the proof of [BHLS03] in the language of asymptotic
resources.

Proof. $E(U) \le \Delta E_U$ [converse]: Consider an arbitrary protocol that uses $U$ $n$ times in order to
generate a state $\omega$ that is $\epsilon$-close to $\Phi^{\otimes nE(U)(1-\delta)}$. We will prove a stronger result, in which even with unlimited classical
communication $U$ cannot generate more than $\Delta E_U$ ebits per use. Since communication is free, we
assume that instead of discarding subsystems, Alice and Bob perform complete measurements and
classically communicate their outcomes. Thus, we always work with pure states.

*Note that our definition of $\langle U\rangle$ differs slightly from the definition in [BHLS03]; whereas [BHLS03] allowed $n$
sequential uses of $U$ interspersed by local operations (i.e. the depth $n$ resource $U^{\times n}$), we follow Definition 1.16 and
allow only $(U^{\otimes n/k})^{\times k}$ where $k$ depends only on the target inefficiency and not the desired accuracy of the protocol.

Since LOCC cannot increase expected entanglement and Alice and Bob start with a product
state, their final state must have expected entanglement $\le n\Delta E_U$. However, by Fannes' inequality
(Lemma 1.1) the output state must have entanglement $\ge nE(U)(1-\delta)(1-\epsilon) - \eta(\epsilon)$. Thus $\forall \epsilon, \delta > 0$ we
can choose $n$ sufficiently large that $E(U)(1-\delta)(1-\epsilon) - \eta(\epsilon)/n \le \Delta E_U$, implying that $E(U) \le \Delta E_U$.*

$E(U) \ge \Delta E_U$ [coding theorem]: Assume $\Delta E_U > 0$; otherwise the claim is trivial. Recall from
Eqns. (1.25) and (1.26) our formulation of entanglement concentration $\langle\psi\rangle \ge E(\psi)[qq]$ and dilution
$E(\psi)[qq] + o[c\to c] \ge \langle\psi\rangle$. Then

$$\langle U\rangle + E(\psi)[qq] \ge \langle U\rangle + o[c\to c] + E(\psi)[qq] \ge \langle U\rangle + \langle\psi\rangle \ge \langle U|\psi\rangle\rangle \ge E(U|\psi\rangle)[qq], \qquad (2.9)$$

where we have used Proposition 2.5 in the first inequality, entanglement dilution in the second
inequality, and entanglement concentration in the last inequality. Using the Cancellation Lemma (1.37),
we find that $\langle U\rangle + o[qq] \ge (E(U|\psi\rangle) - E(|\psi\rangle))[qq]$, and the sublinear $[qq]$ term can be removed due to
Proposition 2.4 and the fact that $\Delta E_U > 0$ implies $\mathrm{Sch}(U) > 1$. Thus $\langle U\rangle \ge (E(U|\psi\rangle) - E(|\psi\rangle))[qq]$ for
all $\psi$. Taking the supremum over $\psi$ and using the Closure Lemma (1.32) yields the desired result. □

The problem of finding $E(U)$ is now reduced to calculating the supremum in Eq. (2.8). To help
understand the properties of Eq. (2.8), we now consider a number of possible variations on it, as well
as some attempts at simplification.

• Restricting the size of the ancilla appears hard: Solving Eq. (2.8) requires optimizing over ancilla
systems $A'$ and $B'$ of unbounded size. Unfortunately, we don't know if the supremum is achieved
for any finite-dimensional ancilla size, so we can't give an algorithm with bounded running time
that reliably approximates $E(U)$. On the one hand, we know that ancilla systems are sometimes
necessary. The two-qubit SWAP gate can generate no entanglement without entangled ancillas,
and achieves its maximum of 2 ebits when acting on $|\Phi\rangle_{AA'}|\Phi\rangle_{BB'}$; a separation that is in a
sense maximal. On the other hand, some gates, such as CNOT, can achieve their entanglement
capacity with no ancillas. Less trivially, [CLVV03] proved that two-qubit Hamiltonians of the
form $H = \alpha X \otimes X + \beta Y \otimes Y$ can achieve their entanglement capacity without ancilla systems,
though this no longer holds when a $Z \otimes Z$ term is added.

It is reasonable to assume that even when ancillas are necessary, it should suffice to take them
to be the same size as the input systems. Indeed, no examples are known where achieving the
entanglement capacity requires $\dim A' > \dim A$ or $\dim B' > \dim B$. On the other hand, there
is no proof that the capacity is achieved for any finite-dimensional ancilla; we cannot rule out
the possibility that there is only an infinite sequence of states that converges to the capacity.
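
The CNOT and SWAP cases mentioned in this item can be reproduced with a short calculation of the single-shot gain in Eq. (2.8). This is a minimal sketch assuming NumPy and logarithms base 2; the function names and the particular product input $|{+}\rangle|0\rangle$ are illustrative choices.

```python
import numpy as np

def entanglement_entropy(psi, dim_alice, dim_bob):
    """Entropy of entanglement (in ebits) of a pure state on Alice (x) Bob."""
    amps = np.asarray(psi, dtype=complex).reshape(dim_alice, dim_bob)
    p = np.linalg.svd(amps, compute_uv=False) ** 2
    p = p[p > 1e-12]
    return float(-np.sum(p * np.log2(p)))

CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)
SWAP = np.array([[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]], dtype=complex)

def gain_without_ancilla(gate, alpha, beta):
    """Entanglement gained by one use of gate on the product input |alpha>_A |beta>_B."""
    psi = np.kron(alpha, beta)
    return entanglement_entropy(gate @ psi, 2, 2) - entanglement_entropy(psi, 2, 2)

def gain_with_entangled_ancillas(gate):
    """Gain across the AA' | BB' cut when gate acts on |Phi>_AA' |Phi>_BB'."""
    phi = np.eye(2).reshape(4) / np.sqrt(2)
    psi = np.kron(phi, phi).reshape(2, 2, 2, 2)                    # [a, a', b, b']
    out = np.einsum('abxy,xpyq->apbq', gate.reshape(2, 2, 2, 2), psi)
    return entanglement_entropy(out, 4, 4) - entanglement_entropy(psi, 4, 4)

plus = np.array([1.0, 1.0]) / np.sqrt(2)
zero = np.array([1.0, 0.0])

print(gain_without_ancilla(CNOT, plus, zero))   # CNOT on |+>|0>, no ancillas
print(gain_without_ancilla(SWAP, plus, zero))   # SWAP maps product states to product states
print(gain_with_entangled_ancillas(SWAP))       # SWAP on |Phi>_AA' |Phi>_BB'
```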

• Infinite dimensional ancilla don't help: Though we cannot put an upper bound on the necessary
dimensions of $A'$ and $B'$, we can assume that they are finite dimensional. In other words, we will
show that $\Delta E_U$ is unchanged if we modify the sup in Eq. (2.8) to optimize over $|\psi\rangle \in \mathcal{H}_{AA'BB'}$
s.t. $E(\psi) < \infty$ and $\dim\mathcal{H}_{A'} = \dim\mathcal{H}_{B'} = \infty$. Denote this modified supremum by $\Delta E'_U$. We
will prove that $\Delta E_U = \Delta E'_U$.

First, we state a useful lemma. 

Lemma 2.7. Any bipartite state $|\psi\rangle$ with $E(\psi) < \infty$ can be approximated by a series of states
$|\varphi_1\rangle, |\varphi_2\rangle, \ldots$, each with finite Schmidt number and obeying $\|\psi - \varphi_n\|_1 \log\mathrm{Sch}(\varphi_n) \to 0$ as
$n \to \infty$. (In other words, the error converges to zero faster than $1/\log\mathrm{Sch}(\varphi_n)$.)

Proof. Schmidt decompose $|\psi\rangle$ as $|\psi\rangle = \sum_{i=1}^\infty \sqrt{\lambda_i}\,|i\rangle|i\rangle$ and define the normalized state $|\varphi_n\rangle =
\sum_{i=1}^n \sqrt{\lambda_i}\,|i\rangle|i\rangle \big/ \sqrt{\sum_{i=1}^n \lambda_i}$. Let $\delta_n := \frac12\|\psi - \varphi_n\|_1 = \sum_{i>n} \lambda_i$. Now use the fact that $E(\psi) < \infty$
and $\lambda_n \le 1/n$ to obtain

$$E(\psi) - \sum_{i=1}^n \lambda_i\log(1/\lambda_i) = \sum_{i=n+1}^\infty \lambda_i\log(1/\lambda_i) \ge \sum_{i=n+1}^\infty \lambda_i\log(1/\lambda_n) = \delta_n\log(1/\lambda_n) \ge \delta_n\log n \qquad (2.10)$$

Since the term on the left converges to 0 as $n \to \infty$, we also have that $\delta_n\log n \to 0$ as $n \to \infty$.
Using $n = \mathrm{Sch}(\varphi_n)$ and $\delta_n = \frac12\|\psi - \varphi_n\|_1$, our desired result follows. □

*A more formal (and general) version of this argument will also appear in the proof of Theorem 3.7 in Section 3.4.2.

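Now $\forall \epsilon > 0$, $\exists |\psi\rangle \in \mathcal{H}_{AA'BB'}$ with $\dim\mathcal{H}_{A'} = \dim\mathcal{H}_{B'} = \infty$ such that $E(U|\psi\rangle) - E(|\psi\rangle) >
\Delta E'_U - \epsilon$. By Lemma 2.7, we can choose $|\varphi\rangle$ with $\mathrm{Sch}(\varphi) < \infty$ (and thus can belong to
$\mathcal{H}_{AA'BB'}$ with $\dim A', \dim B' < \infty$) such that $\|\psi - \varphi\|_1 \log\mathrm{Sch}(\varphi) < \epsilon$. By Fannes' inequality
(Lemma 1.1), $|E(\psi) - E(\varphi)| \le \epsilon + \eta(\epsilon)/\log\mathrm{Sch}(\varphi)$ and $|E(U|\psi\rangle) - E(U|\varphi\rangle)| \le (\epsilon + \eta(\epsilon))(1 +
(\log\mathrm{Sch}(\varphi))/(\log\mathrm{Sch}(U)))$ (since $\mathrm{Sch}(U\varphi) \le \mathrm{Sch}(U)\,\mathrm{Sch}(\varphi)$). Combining these, we find that
$E(U|\varphi\rangle) - E(|\varphi\rangle) \to \Delta E'_U$ as $\epsilon \to 0$, implying that $\Delta E_U = \Delta E'_U$.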
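
The chain of inequalities in Eq. (2.10) can be illustrated on a concrete infinite-rank state. The sketch below uses a hypothetical choice of Schmidt coefficients, $\lambda_i = 2^{-i}$ for $i = 1, 2, \ldots$, for which $E(\psi) = \sum_i i\,2^{-i} = 2$ bits and the tail weight $\sum_{i>n}\lambda_i = 2^{-n}$ is available in closed form. Logarithms are taken base 2.

```python
import math

def lemma_2_7_chain(n):
    """Terms of Eq. (2.10) for lambda_i = 2**-i (i = 1, 2, ...), in bits."""
    total_entropy = 2.0                                      # sum_i i * 2**-i
    head_entropy = sum(i * 2.0 ** -i for i in range(1, n + 1))
    tail_weight = 2.0 ** -n                                  # sum_{i>n} lambda_i
    lhs = total_entropy - head_entropy                       # entropy carried by the tail
    mid = tail_weight * n                                    # tail weight * log2(1/lambda_n)
    rhs = tail_weight * math.log2(n)                         # tail weight * log2(n)
    return lhs, mid, rhs

for n in (1, 2, 5, 10, 20, 40):
    lhs, mid, rhs = lemma_2_7_chain(n)
    print(n, lhs, mid, rhs)
```

For this example all three terms decay exponentially in $n$, so the tail weight vanishes far faster than $1/\log n$.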

• Sometimes it helps to start with entanglement: Subtracting one entropy from another in
Eq. (2.8) is rather ugly; it would be nice if we could eliminate the second term (and at the
same time restrict $\dim A' \le \dim A$ and $\dim B' \le \dim B$) by maximizing only over product
state inputs. However, this would result in a strictly lower capacity for some gates. This is
seen most dramatically for Hamiltonian capacities, for which $\frac{d}{dt}E(e^{-iHt}|\alpha\rangle|\beta\rangle)\big|_{t=0} = 0$ for any
$|\alpha\rangle \in \mathcal{H}_{AA'}$, $|\beta\rangle \in \mathcal{H}_{BB'}$, due to the quantum Zeno effect: after a small amount of time $t$, the
largest Schmidt coefficient is $1 - O(t^2)$. The same principle applies to the gate $U = e^{-iHt}$ for $t$
sufficiently small: the entanglement capacity is $O(t)$ (because $O(1/t)$ uses of $U$ give a gate far
from the identity with $O(1)$ entanglement capacity), though the most entanglement that can
be created from unentangled inputs by one use of $U$ is $O(t^2\log(1/t))$.

As a corollary, the lower bound of Proposition 2.4 is not tight for all gates.

• Mixed states need not be considered: We might also try optimizing over density matrices rather
than pure states. For this to be meaningful, we need to replace the entropy of entanglement
with a measure of mixed-state entanglement [BDSW96], such as entanglement of formation
$E_f(\rho) := \min\{\sum_i p_i E(\psi_i) : \rho = \sum_i p_i\psi_i\}$, entanglement cost $E_c(\rho) := \inf_m \frac1m E_f(\rho^{\otimes m}) =
\inf\{e : e[qq] + \infty[c\to c] \ge \langle\rho\rangle\}$, or distillable entanglement $D(\rho) := \sup\{e : \langle\rho\rangle + \infty[c\to c] +
\infty[c\leftarrow c] \ge e[qq]\}$ [BDSW96].

We claim that $\Delta E_U = \sup_\rho E_f(U(\rho)) - E_f(\rho) = \sup_\rho E_c(U(\rho)) - E_c(\rho) = \sup_\rho D(U(\rho)) - E_c(\rho)$.
To prove this for $E_f$, decompose an arbitrary $\rho^{AB}$ into pure states as $\rho = \sum_i p_i\psi_i$ s.t. $E_f(\rho) =
\sum_i p_i E(\psi_i)$. Now we use the convexity of $E_f$ to show that $E_f(U(\rho)) = E_f(\sum_i p_i U(\psi_i)) \le
\sum_i p_i E(U(\psi_i))$, implying that

$$E_f(U(\rho)) - E_f(\rho) \le \sum_i p_i\,[E(U(\psi_i)) - E(\psi_i)] \le \max_i E(U(\psi_i)) - E(\psi_i).$$

Thus, any increase in $E_f$ can be achieved by a pure state.

A similar, though slightly more complicated, argument applies for $E_c$. For any $\epsilon > 0$ and
any $\rho$, there exists $m$ sufficiently large that $E_c(\rho) + \epsilon \ge \frac1m\sum_i p_i E(\psi_i)$ for some $\{p_i, \psi_i\}$ such
that $\rho^{\otimes m} = \sum_i p_i\psi_i$. Using first the definition of $E_c$ and then convexity, we have $E_c(U(\rho)) \le
\frac1m E_f(U(\rho)^{\otimes m}) \le \frac1m\sum_i p_i E(U^{\otimes m}(\psi_i))$. Thus,

$$E_c(U(\rho)) - E_c(\rho) - \epsilon \le \frac1m\sum_i p_i\,[E(U^{\otimes m}(\psi_i)) - E(\psi_i)] \le \max_i\,(E(U^{\otimes m}(\psi_i)) - E(\psi_i))/m$$

$$\le \max_i \max_{j\in\{1,\ldots,m\}} E((U^{\otimes j} \otimes I^{\otimes m-j})(\psi_i)) - E((U^{\otimes j-1} \otimes I^{\otimes m-j+1})(\psi_i)) \le \Delta E_U.$$

This proof implicitly uses the fact that $E(U)$ is (sub)additive; i.e. $E(U^{\otimes 2}) = 2E(U)$.

Finally, $\Delta E_U = \sup_\rho D(U(\rho)) - E_c(\rho)$ because of the $E_c$ result from the last paragraph and
the fact that $D(\rho) \le E_c(\rho)$. This case corresponds to the operationally reasonable scenario of
paying $E_c(\rho)[qq]$ for the input state and getting $D(U(\rho))[qq]$ from the output state. Of course,
this case also follows from the fact that classical communication doesn't help entanglement
capacity.
Contrasting the entanglement capacity of unitary gates and noisy quantum channels: The problem
of generating entanglement with a unitary gate turns out to have a number of interesting differences
from the analogous problem of using a noisy quantum channel to share entanglement. Here we survey
some of those differences.

• Free classical communication doesn't help: In the proof of the converse of Theorem 2.6, we 
observed that unlimited classical communication in both directions doesn't increase the en- 
tanglement capacity. For noisy quantum channels, it is known that forward communication 
doesn't change the entanglement capacity [BDSW96], though in some cases back communica- 
tion can improve the capacity (e.g. back communication increases the capacity of the 50% 
erasure channel from zero to 1/2) and two-way communication appears to further improve the 
capacity [BDSS04]. 

• Quantum and entanglement capacities appear to be different: A noisy quantum channel $\mathcal{N}$ has
the same capacity to send quantum data that it has to generate entanglement (i.e. $\langle\mathcal{N}\rangle \ge Q[q\to q]$
iff $\langle\mathcal{N}\rangle \ge Q[qq]$) [BDSW96], though with free classical back communication this is no longer
thought to hold [BDSS04]. Since unitary gates are intrinsically bidirectional, we might instead
ask about their total quantum capacity $Q_+(U) := \max\{Q_1 + Q_2 : \langle U\rangle \ge Q_1[q\to q] + Q_2[q\leftarrow q]\}$
and ask whether it is equal to $E(U)$. All that is currently known is the bound $Q_+(U) \le E(U)$,
which is saturated for gates like CNOT and SWAP. However, in Section 2.4.3, I will give an
example of a gate that appears to have $Q_+(U) < E(U)$, though this conjecture is supported
only by heuristic arguments.

• Entanglement capacities are strongly additive: For any two bipartite gates $U_1$ and $U_2$, we have
$E(U_1 \otimes U_2) \ge E(U_1) + E(U_2)$, since we can always run the optimal entanglement generating
protocols of $U_1$ and $U_2$ in parallel. On the other hand, $E(U_1 \otimes U_2) = \sup_\psi E((U_1 \otimes
U_2)|\psi\rangle) - E(|\psi\rangle) = \sup_\psi\,[E((U_1 \otimes U_2)|\psi\rangle) - E((U_1 \otimes I)|\psi\rangle)] + [E((U_1 \otimes I)|\psi\rangle) - E(|\psi\rangle)] \le
\Delta E_{U_2} + \Delta E_{U_1} = E(U_2) + E(U_1)$. Thus $E(U_1 \otimes U_2) = E(U_1) + E(U_2)$.

In contrast, quantum channel capacities (equivalently either for quantum communication or
entanglement generation) appear to be superadditive [SST01].

• Entanglement capacities are always nonzero: If $U$ is a nonlocal gate (i.e. cannot be written as
$U = U_A \otimes U_B$), then according to Proposition 2.4, $E(U) > 0$. On the other hand, there exist
nontrivial quantum channels with zero entanglement capacity: classical channels cannot create 
entanglement and bound entangled channels cannot be simulated classically, but also cannot 
create any pure entanglement. 

2.3 Classical communication capacity 

Nonlocal gates can not only create entanglement, but can also send classical messages both forward
(from Alice to Bob) and backwards (from Bob to Alice). Therefore, instead of a single capacity, we
need to consider an achievable classical rate region. Define $\mathrm{CC}(U) := \{(C_1, C_2) : \langle U\rangle \ge C_1[c\to c] +
C_2[c\leftarrow c]\}$. Some useful special cases are the forward capacity $C_\rightarrow(U) = \max\{C_1 : (C_1, 0) \in \mathrm{CC}(U)\}$,
backward capacity $C_\leftarrow(U) = \max\{C_2 : (0, C_2) \in \mathrm{CC}(U)\}$ and bidirectional capacity $C_+(U) =
\max\{C_1 + C_2 : (C_1, C_2) \in \mathrm{CC}(U)\}$. (By Lemma 1.9 $\mathrm{CC}(U)$ is a closed set, so these maxima always
exist.)

We can also consider the goal of simultaneously transmitting classical messages and generating
entanglement. Alternatively, one might want to use some entanglement to help transmit classical
messages. We unify these scenarios and others by considering the three-dimensional rate region
$\mathrm{CCE}(U) := \{(C_1, C_2, E) : \langle U\rangle \ge C_1[c\to c] + C_2[c\leftarrow c] + E[qq]\}$. When some of $C_1$, $C_2$ and $E$ are
negative, it means that the resource is being consumed; for example, if $E < 0$ and $C_1, C_2 \ge 0$, then the
resource inequality $\langle U\rangle + (-E)[qq] \ge C_1[c\to c] + C_2[c\leftarrow c]$ represents entanglement-assisted communication.
Some useful limiting capacities are $C^E_\rightarrow(U) := \max\{C_1 : (C_1, 0, -\infty) \in \mathrm{CCE}(U)\}$, $C^E_\leftarrow(U) :=
\max\{C_2 : (0, C_2, -\infty) \in \mathrm{CCE}(U)\}$ and $C^E_+(U) := \max\{C_1 + C_2 : (C_1, C_2, -\infty) \in \mathrm{CCE}(U)\}$.

To get a sense of what these capacity regions can look like, Fig. 2-1 contains a schematic diagram
for the achievable region $\mathrm{CC}(U)$ and the definitions of the various capacities when we set $E = 0$. We
present all the known properties and intentionally show the features that are not ruled out, such as
the asymmetry of the region, and the nonzero curvature of the boundary.



Figure 2-1: Example of a possible achievable rate region $\mathrm{CC}(U)$, with the limiting capacities of
$C_\rightarrow$, $C_\leftarrow$ and $C_+$ indicated.

There are much simpler examples: the unassisted achievable regions for CNOT and SWAP are similar
triangles with vertices $\{(0,0), (0,1), (1,0)\}$ and $\{(0,0), (0,2), (2,0)\}$ respectively (see Section 2.4.1).
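
As a small illustration of these triangular regions, the following sketch tests whether a nonnegative rate pair lies in the triangle with vertices $(0,0)$, $(0, C_+)$ and $(C_+, 0)$. The function name and the constants are hypothetical labels for the values quoted above.

```python
def in_unassisted_region(c1, c2, c_plus):
    """True if (c1, c2) lies in the triangle with vertices (0,0), (0,c_plus), (c_plus,0)."""
    return c1 >= 0 and c2 >= 0 and c1 + c2 <= c_plus

CNOT_C_PLUS = 1.0   # triangle {(0,0), (0,1), (1,0)} quoted in the text
SWAP_C_PLUS = 2.0   # triangle {(0,0), (0,2), (2,0)} quoted in the text

# One bit in each direction per gate use fits the SWAP triangle but not the CNOT one.
print(in_unassisted_region(1.0, 1.0, CNOT_C_PLUS))
print(in_unassisted_region(1.0, 1.0, SWAP_C_PLUS))
```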

In general, little is known about the unassisted achievable region of $(C_1, C_2)$ besides the convexity
and the monotonicity of its boundary. The most perplexing question is perhaps whether the region has
reflective symmetry about the line $C_1 = C_2$, which would imply that $C_\rightarrow(U) = C_\leftarrow(U)$. Eq. (2.1) shows
that any two-qubit gate or Hamiltonian is locally equivalent to one with Alice and Bob interchanged,
so that the achievable region is indeed symmetric. In higher dimensions, on the other hand, [BCL+02]
shows that there are Hamiltonians (and so unitary gates) that are intrinsically asymmetric. However,
it remains open whether the achievable rate pairs are symmetric, or more weakly, whether $C_\rightarrow = C_\leftarrow$.

The rest of this section is as follows: 

Section 2.3.1 proves some basic facts about the achievable classical communication region. Then
we establish some bounds on communication rates similar to, but weaker than, the bounds on
entanglement rate in Proposition 2.4.

Section 2.3.2 proves a capacity formula for $C^E_\rightarrow(U)$ (or equivalently $C^E_\leftarrow(U)$) that parallels the
formula in Theorem 2.6. This formula will be improved in the next chapter when we introduce
coherent classical communication.

Section 2.3.3 discusses relations between the classical communication and the entanglement
generation capacities of unitary gates.

Section 2.3.4 explores the difficulties involved in proving capacity theorems for bidirectional
communication.

2.3.1 General facts about the achievable classical communication rate region

We begin with some basic facts about CCE.

• Monotonicity: If $(C_1, C_2, E) \in \mathrm{CCE}(U)$ then $(C_1 - \delta_1, C_2 - \delta_2, E - \delta_3) \in \mathrm{CCE}(U)$ for any
$\delta_1, \delta_2, \delta_3 \ge 0$. This is because we can always choose to discard resources.

• Convexity: $\mathrm{CCE}(U)$ is a convex set. This follows from time-sharing (part 2 of Theorem 1.22
and part 3 of Lemma 1.25).

• Classical feedback does not help: If $(C_1, C_2, E) \in \mathrm{CCE}(U)$, then $(C_1, 0, E) \in \mathrm{CCE}(U)$ and
$(0, C_2, E) \in \mathrm{CCE}(U)$. We mention this fact now, but defer the proof until Chapter 3.

Combining this with monotonicity and the fact that classical feedback doesn't improve entanglement
capacity, we obtain as a corollary that $\mathrm{CCE}(U) \subseteq [-\infty, C^E_\rightarrow(U)] \times [-\infty, C^E_\leftarrow(U)] \times
[-\infty, E(U)] \subseteq [-\infty, 2\log d] \times [-\infty, 2\log d] \times [-\infty, 2\log d]$. This second inclusion depends on
Proposition 2.8, proven below.

• No more than $E(U^\dagger)$ ebits are ever needed: If $(C_1, C_2, E) \in \mathrm{CCE}(U)$, then $(C_1, C_2, -E(U^\dagger)) \in
\mathrm{CCE}(U)$. A proof of this will be sketched in Section 2.3.3, and it also follows from Theorem 3.1
in the next chapter.

• Shared randomness does not help: If $\langle U\rangle + \infty[cc] \ge C_1[c\to c] + C_2[c\leftarrow c] + E[qq]$, then
$(C_1, C_2, E) \in \mathrm{CCE}(U)$.


This is due to a standard derandomization argument (further developed in [CK81, DW05b]).
Let $r$ denote the shared randomness and let $x := (a, b)$ run over all possible messages sent by
Alice and Bob with $n$ uses of $U$ (a set of size $\le \exp(Cn)$ for $C := C_1 + C_2$). If $e_{x,r}$ is the
corresponding probability of error, then our error-correcting condition is that $\max_x \mathbb{E}_r\, e_{x,r} \le \epsilon$.
Now sample $m$ copies of the shared randomness, $(r_1, \ldots, r_m) =: \vec r$, where $m$ is a parameter we
will choose later. According to Hoeffding's inequality [Hoe63], we have

$$\Pr_{\vec r}\left[\frac1m\sum_{i=1}^m e_{x,r_i} > 2\epsilon\right] \le \exp(-m\epsilon^2/2) \qquad (2.11)$$

for any particular value of $x$. We apply the union bound over all $\le \exp(Cn)$ values of $x$ to
obtain

$$\Pr_{\vec r}\left[\max_x \frac1m\sum_{i=1}^m e_{x,r_i} > 2\epsilon\right] \le \exp(Cn - m\epsilon^2/2) \qquad (2.12)$$

Thus, if we choose $m > 2Cn/\epsilon^2$, then there exists a choice of $\vec r$ with maximum error $\le 2\epsilon$. If
Alice and Bob preagree on $\vec r$, then they need only $\log m$ bits of shared randomness to agree
on which $r_i$ to use. Since $\log m = O(\log n + \log(1/\epsilon))$, this randomness can be generated by a
negligible amount of extra communication.
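
To get a feel for the numbers, the sketch below computes the smallest integer $m$ satisfying $m > 2Cn/\epsilon^2$ and the number of shared random bits needed to index one of the $m$ samples. The function name and the parameter values are illustrative assumptions.

```python
import math

def derandomization_cost(n, c, eps):
    """Return (m, seed_bits): the sample count m > 2*c*n/eps**2 and ceil(log2(m))."""
    m = math.floor(2 * c * n / eps ** 2) + 1   # smallest integer exceeding 2Cn/eps^2
    seed_bits = (m - 1).bit_length()           # exact ceil(log2(m)) for m >= 1
    return m, seed_bits

for n in (10**3, 10**6, 10**9):
    m, seed_bits = derandomization_cost(n, c=2.0, eps=0.01)
    # seed_bits / n is the extra randomness per use of U, which shrinks as n grows.
    print(n, m, seed_bits, seed_bits / n)
```

The seed length grows only logarithmically in $n$ and $1/\epsilon$, which is why its cost per use of $U$ vanishes asymptotically.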

We now state an upper bound, originally due to [CGB00].

Proposition 2.8. If $U \in \mathcal{U}_{d\times d}$, then $C^E_\rightarrow(U) \le 2\log d$ and $C^E_\leftarrow(U) \le 2\log d$.

Proof. The proof is based on simulating U with teleportation: Alice teleports her input to Bob using 
2 log d[c — > c] + log d[qq], Bob applics U locally (and hence for free), and then Bob teleports Alice's 



(C 1: C 2 ,E) G CCE(U). 




(2.11) 




(2.12) 
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half of the state back using 21ogd[c <— c] + \ogd\qq\. Thus we obtain the resource incquality 

2 log d ([c c] + [c <- c] + [gg]) > log d ([g -+ g] + [g <- ç]) > (U) (2.13) 

Allowing free entanglement and back communication yields 2 log d[c — ► c]+oo[q <— g] > C^(U)[c — > c]. 
Causality[Hol73] implies that C^(U) < 2 log d. A similar argument proves that C^_(U) < 2 log d. □ 
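
The sample-size bookkeeping in the derandomization argument for shared randomness can be sketched numerically. The following is only an illustration under stated assumptions: the function names are hypothetical, the rate $C = C_1 + C_2$ is measured in nats so that there are at most $\exp(Cn)$ message pairs, and the tail bound $\exp(-m\epsilon^2/2)$ is assumed (any Hoeffding-type bound gives the same scaling in $n$ and $\epsilon$).

```python
import math


def samples_needed(C, n, eps):
    """Smallest integer m with m > 2*C*n/eps**2 (C in nats per use of U)."""
    return math.floor(2 * C * n / eps**2) + 1


def failure_bound(C, n, eps, m):
    """Union bound exp(C*n - m*eps**2/2) over at most exp(C*n) message pairs.

    Assumes a per-message tail bound of exp(-m*eps**2/2).
    """
    return math.exp(C * n - m * eps**2 / 2)


def seed_bits(m):
    """Bits of shared randomness needed to pick one of the m samples r_i."""
    return math.ceil(math.log2(m))


m = samples_needed(C=1.0, n=100, eps=0.5)
print(m, failure_bound(1.0, 100, 0.5, m) < 1, seed_bits(m))
```

Because the failure bound is strictly below one, some fixed choice of the $m$ samples works for every message pair, and the number of seed bits grows only logarithmically in $n$ and $1/\epsilon$.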

It is an interesting open question whether any good bounds on classical capacity can be obtained as
functions of a gate's Schmidt coefficients, as we found with Proposition 2.4 for the case of entanglement
generation.

We now prove Proposition 2.5, which stated that any nonlocal $U$ has a nonzero classical capacity.
An alternate proof can be found in [BGNP01].

Proof of Proposition 2.5. Let $E_0$ be the amount of entanglement created by applying $U$ to the $AB$
registers of $|\Phi_d\rangle_{AA'}|\Phi_d\rangle_{BB'}$. If $U$ is nonlocal, then $E_0 > 0$ according to Proposition 2.4.

Alice can send a noisy bit to Bob with the following $t$-use protocol. Bob inputs $|\Phi_d\rangle_{BB'}$ to all $t$
uses of $U$. To send "0" Alice inputs $|\Phi_d\rangle_{AA'}^{\otimes t}$ to share $tE_0$ ebits with Bob, i.e. inputting a fresh copy
of $|\Phi_d\rangle$ each time. To send "1", Alice inputs $|0\rangle_A$ to the first use of $U$, takes the output and uses it
as the input to the second use, and so on. Alice only interacts with a $d$-dimensional register throughout
the protocol, so their final entanglement is no more than $\log d$. Thus different messages from Alice
result in very different amounts of entanglement at the end of the protocol.

Let $\rho_0$ and $\rho_1$ denote Bob's density matrices when Alice sends 0 or 1 respectively. Using Fannes'
inequality (Lemma 1.1), $tE_0 - \log d \leq \log d\,\|\rho_0 - \rho_1\|_1 + \frac{\log e}{e}$. If we choose
$t > (\log d + \frac{\log e}{e})/E_0$,
then $\rho_0 \neq \rho_1$ and Bob has a nonzero probability of distinguishing $\rho_0$ from $\rho_1$ and thereby identifying
Alice's message. Thus the $t$-use protocol simulates a noisy classical channel with nonzero capacity
and $C_\rightarrow(U) > 0$. □

2.3.2 Capacity theorem for entanglement-assisted one-way classical communication

We conclude the section with a general expression for $C_\rightarrow^E(U)$. Though we will improve it in Chapter 3
to characterize the entire one-way tradeoff region $\mathrm{CE}(U) := \{(C, E) : (C, 0, E) \in \mathrm{CCE}(U)\}$, the proof
outlines useful principles which we will later use.

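First, we recall some notation from our definition of remote state preparation (RSP) in Section 1.4.

Let

$$\mathcal{E} = \sum_i p_i\, |i\rangle\langle i|^{X_A} \otimes |\psi_i\rangle\langle\psi_i|^{A_1A_2B_1B_2} \qquad (2.14)$$

be an ensemble of bipartite states $|\psi_i\rangle$, where Alice holds the index $i$, $U$ acts on $A_1B_1$ and $A_2, B_2$
are ancilla spaces. Thus we can define $U(\mathcal{E})$ by

$$U(\mathcal{E}) := \sum_i p_i\, |i\rangle\langle i|^{X_A} \otimes (U^{A_1B_1} \otimes I^{A_2B_2})(|\psi_i\rangle\langle\psi_i|^{A_1A_2B_1B_2}) \qquad (2.15)$$

We will use $A$ to denote the composite system $A_1A_2$ and $B$ to denote $B_1B_2$. As in Section 1.4, define
the $\{c \to q\}$ channel $\mathcal{N}_{\mathcal{E}}$ by $\mathcal{N}_{\mathcal{E}}(|i\rangle\langle i|) = |i\rangle\langle i| \otimes \psi_i$, so that $\mathcal{E} = \mathcal{N}_{\mathcal{E}}(\mathcal{E}^{X_A})$. Defining $\mathcal{N}_{U(\mathcal{E})}$
similarly, we can use Lemma 1.31 to show that

$$\langle\mathcal{N}_{\mathcal{E}} : \mathcal{E}^{X_A}\rangle + \langle U\rangle \geq \langle U \circ \mathcal{N}_{\mathcal{E}} : \mathcal{E}^{X_A}\rangle = \langle\mathcal{N}_{U(\mathcal{E})} : \mathcal{E}^{X_A}\rangle. \qquad (2.16)$$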

Recall from HSW coding (Eq. (1.42)) that

$$\langle\mathcal{N}_{\mathcal{E}} : \mathcal{E}^{X_A}\rangle \geq I(X_A; B)_{\mathcal{E}}\,[c \to c], \qquad (2.17)$$

while RSP (Eq. (1.38)) states that

$$I(X_A; B)_{\mathcal{E}}\,[c \to c] + H(B)[qq] \geq \langle\mathcal{N}_{\mathcal{E}} : \mathcal{E}^{X_A}\rangle. \qquad (2.18)$$



In the presence of free entanglement, these resource inequalities combine to become an equality:

$$I(X_A; B)_{\mathcal{E}}\,[c \to c] + \infty[qq] = \langle\mathcal{N}_{\mathcal{E}} : \mathcal{E}^{X_A}\rangle + \infty[qq]. \qquad (2.19)$$

This remarkable fact can be thought of as a sort of reverse Shannon theorem for $\{c \to q\}$ channels,
stating that when entanglement is free (in contrast to the CRST, which requires free rbits), any
$\{c \to q\}$ channel on a fixed source is equivalent to an amount of classical communication given by its
capacity.

Recall the similar equality for partially entangled states in the presence of a sublinear amount of
classical communication: $\langle\psi^{AB}\rangle + o[c \to c] = H(B)_\psi[qq] + o[c \to c]$. By analogy with entanglement
generation in Theorem 2.6, we will use the resource equality in Eq. (2.19) to derive a capacity theorem
for classical communication in the presence of unlimited entanglement.

Theorem 2.9.

$$C_\rightarrow^E(U) = \Delta\chi_U := \sup_{\mathcal{E}} \left[ I(X_A; B)_{U(\mathcal{E})} - I(X_A; B)_{\mathcal{E}} \right] \qquad (2.20)$$

where the supremum is over all ensembles $\mathcal{E}$ of the form in Eq. (2.14).
The proof closely follows the proof of Theorem 2.6.

Proof. We begin with the converse, proving that $C_\rightarrow^E(U) \leq \Delta\chi_U$. Alice and Bob begin with a fixed
input state, which can be thought of as an ensemble $\mathcal{E}_0$ with $I(X_A; B)_{\mathcal{E}_0} = 0$. Local operations (which
for simplicity, we can assume are all isometries) cannot increase $I(X_A; B)$, so after $n$ uses of $U$ the
mutual information must be $\leq n\Delta\chi_U$. (For a generalized and more formal version of this argument,
see the proof of Theorem 3.7 in Section 3.4.2.) The bound $C_\rightarrow^E(U) \leq \Delta\chi_U$ then follows from Fannes'
inequality.

Coding theorem: For any ensemble $\mathcal{E}$, we have
$\langle U\rangle + I(X_A; B)_{\mathcal{E}}[c \to c] + \infty[qq] \geq \langle U\rangle + \langle\mathcal{N}_{\mathcal{E}} : \mathcal{E}^{X_A}\rangle + \infty[qq]
\geq \langle\mathcal{N}_{U(\mathcal{E})} : \mathcal{E}^{X_A}\rangle + \infty[qq] \geq I(X_A; B)_{U(\mathcal{E})}[c \to c] + \infty[qq]$.
Using the Cancellation Lemma (1.37) and taking the supremum over $\mathcal{E}$, we find that
$\langle U\rangle + o[c \to c] + \infty[qq] \geq \Delta\chi_U[c \to c] + \infty[qq]$.
Finally, we can use Proposition 2.5 and Lemma 1.36 to eliminate the sublinear classical communication
cost. □

Although the coding theorem is formally very similar to the coding theorem for entanglement
generation, its implementation looks rather different. Achieving the bound in Theorem 2.6 is rather
straightforward: 1) $n_1$ copies are created of some state $|\psi\rangle^{A_1A_2B_1B_2}$ s.t. $\Delta E_U \approx H(B)_{U|\psi\rangle} - H(B)_\psi$,
2) $U^{\otimes n_1}$ is applied to $\psi^{\otimes n_1}$, 3) entanglement is concentrated from $(U|\psi\rangle)^{\otimes n_1}$, 4) $\approx n_1 H(B)_\psi$ ebits
are used to recreate $\psi^{\otimes n_1}$ and $\approx n_1(H(B)_{U|\psi\rangle} - H(B)_\psi) \approx n_1\Delta E_U$ ebits are set aside as output,
5) steps 2-4 are repeated $n_2$ times to make the cost of the catalyst vanish. The coding scheme
for entanglement-assisted classical communication is similar, but has some additional complications
because different parts of the message are not interchangeable. The resulting protocol involves a
peculiar preprocessing step in which Alice runs through the entire protocol backwards before $U$ is
used for the first time; for this reason, we call it the "looking-glass protocol." The procedure is as
follows:

1. Choose an ensemble $\mathcal{E} = \sum_i p_i |i\rangle\langle i|^{X_A} \otimes \psi_i^{AB}$ with $I(X_A; B)_{U(\mathcal{E})} - I(X_A; B)_{\mathcal{E}} \approx \Delta\chi_U$.

2. The message is broken into $n_1$ blocks $M_1, \ldots, M_{n_1}$, each of length $\approx n_2\Delta\chi_U$. Initialize $R_{n_1}$ to
be an arbitrary string of length $\approx n_2 I(X_A; B)_{\mathcal{E}}$.

3. For $k = n_1, n_1 - 1, \ldots, 1$:

(a) Encode the string $(R_k, M_k)$ ($\approx n_2 I(X_A; B)_{U(\mathcal{E})}$ bits) into an element of $(U(\mathcal{E}))^{\otimes n_2}$, say
$U|\psi_{x_{k,1}}\rangle \otimes \cdots \otimes U|\psi_{x_{k,n_2}}\rangle$ for some $p$-typical string $x_k^{n_2}$. This is accomplished via HSW
coding.

(b) Alice now wishes to use RSP to send $|\psi_{x_k^{n_2}}\rangle := |\psi_{x_{k,1}}\rangle \otimes \cdots \otimes |\psi_{x_{k,n_2}}\rangle$ to Bob. She
performs the RSP measurement on some shared entanglement and obtains an outcome
with $\approx n_2 I(X_A; B)_{\mathcal{E}}$ bits, which she doesn't send to Bob directly, but instead stores in the
register $R_{k-1}$.

4. Finally, Alice sends $R_0$ to Bob using $n_2 I(X_A; B)_{\mathcal{E}}\,[c \to c]$.

5. For $k = 1, \ldots, n_1$:

(a) Bob uses $R_{k-1}$ to perform his half of RSP and reconstruct his half of $|\psi_{x_k^{n_2}}\rangle$.

(b) Alice and Bob apply $U$ $n_2$ times to obtain $\approx U^{\otimes n_2}|\psi_{x_k^{n_2}}\rangle$.

(c) Bob performs HSW decoding to obtain $(M_k, R_k)$ with a high probability of success.

It might seem that errors and inefficiencies from the many HSW and RSP steps accumulate dangerously
over the many rounds of the looking-glass protocol. In [BHLS03], the protocol was carefully
analyzed and the errors and inefficiency were shown to converge to zero. However, the validity of the
composite protocol follows even more directly from the Composability Theorem (1.22); remarkably,
this permits a proof that is much more compact and intuitive than even the description of the above
protocol, let alone a verification of its correctness.
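
The bit accounting of the looking-glass protocol can be sketched as follows. This is an idealised illustration, not the analysis of [BHLS03]: the function names are hypothetical, `chi_in` and `chi_out` stand for $I(X_A;B)$ evaluated on the ensemble before and after $U$ is applied, and every approximate length in the protocol is treated as exact.

```python
def looking_glass_accounting(n1, n2, chi_in, chi_out):
    """Bit accounting for one run of the looking-glass protocol.

    chi_in  : I(X_A;B) on the input ensemble, in bits per copy
    chi_out : I(X_A;B) after U is applied, in bits per copy
    """
    delta_chi = chi_out - chi_in
    message_bits = n1 * n2 * delta_chi  # blocks M_1, ..., M_{n1}
    sent_bits = n2 * chi_in             # register R_0, sent in step 4
    uses_of_U = n1 * n2                 # n2 uses in each of the n1 rounds of step 5
    return {
        "uses_of_U": uses_of_U,
        "message_bits": message_bits,
        "sent_bits": sent_bits,
        "net_rate": (message_bits - sent_bits) / uses_of_U,
    }


def schedule(n1):
    """Order in which registers are touched: Alice backwards, then Bob forwards."""
    alice = [f"Alice: encode (R_{k}, M_{k}), RSP outcome -> R_{k - 1}"
             for k in range(n1, 0, -1)]
    bob = [f"Bob: RSP with R_{k - 1}, apply U, decode (M_{k}, R_{k})"
           for k in range(1, n1 + 1)]
    return alice + ["Alice: send R_0"] + bob


print(looking_glass_accounting(10, 5, 1.0, 3.0)["net_rate"])
for line in schedule(2):
    print(line)
```

In this idealised accounting the net rate is the capacity gap minus `chi_in / n1`, so the one-time cost of sending $R_0$ becomes negligible as the number of blocks $n_1$ grows.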

As a corollary of Theorem 2.9, entanglement-assisted capacities are additive (i.e.
$C_\rightarrow^E(U_1 \otimes U_2) = C_\rightarrow^E(U_1) + C_\rightarrow^E(U_2)$). The proof is basically the same as the proof that $E(U)$ is additive.

Another corollary we can obtain is an optimal coding theorem for entanglement-assisted one-way
quantum communication: $Q_\rightarrow^E(U) := \max\{Q : \langle U\rangle + \infty[qq] \geq Q[q \to q]\} = C_\rightarrow^E(U)/2$. This is because
when entanglement is free, teleportation and super-dense coding imply that 2 cbits are equivalent to
1 qubit.

2.3.3 Relations between entanglement and classical communication capacities

One of the most interesting properties of unitary gates as communication channels is that their
different capacities appear to be closely related. In this section we prove that $C_+(U) \leq E(U)$ and
then discuss some similar bounds.

Proposition 2.10. If $(C_1, C_2, E) \in \mathrm{CCE}(U)$ then $E(U) \geq C_1 + C_2 + E$.

Using the fact that back communication does not improve capacities (proved in the next chapter),
we can improve this bound to $E(U) \geq \max(C_1, 0) + \max(C_2, 0) + E$.

This claim is significant for two reasons. First is that it implies that it may be easier to connect
different unitary gate capacities than it has been to relate different capacities of noisy channels. It
is directly useful in finding gate capacities and raises the intriguing question of whether the converse
inequality of Proposition 2.5 (that $E(U) > 0 \Rightarrow C_\rightarrow(U) > 0$) can be strengthened, and ultimately
whether $C_+(U) = E(U)$.

The fact that $C_+(U) \leq E(U)$ has a deeper implication as well, which is that not all classical
communication is created equal. While normally $[c \to c] \not\geq [qq]$, a cbit sent through unitary means
can be converted into entanglement. This suggests that using unitary gates to communicate gives us
something stronger than classical bits; a resource that we will formally define in the next chapter as
coherent bits or cobits. The consequences will be productive not only for the study of unitary gate
capacities, but also for many other problems in quantum Shannon theory.
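Proof of Proposition 2.10. Assume for now that $E \geq 0$. For any $n$, there is a protocol $\mathcal{P}_n$ that uses
$U$ $n$ times to send $C_1^{(n)}$ cbit($\rightarrow$) $+\ C_2^{(n)}$ cbit($\leftarrow$) and create $E^{(n)}$ ebits with $C_1^{(n)} \geq n(C_1 - \delta_n)$,
$C_2^{(n)} \geq n(C_2 - \delta_n)$, $E^{(n)} \geq n(E - \delta_n)$ and error $\leq \epsilon_n$, where $\delta_n, \epsilon_n \to 0$ as $n \to \infty$. We analyze the protocol
using the QP formalism, in which $\mathcal{P}_n$ is an isometry such that for any $a \in \{0,1\}^{C_1^{(n)}}$, $b \in \{0,1\}^{C_2^{(n)}}$,

$$|\varphi_{ab}\rangle := \mathcal{P}_n |a\rangle_{A_1} |b\rangle_{B_1}$$

and

$$F\left(|b\rangle_{A_1}|a\rangle_{B_1}|\Phi\rangle^{E^{(n)}}_{A_2B_2},\ |\varphi_{ab}\rangle\langle\varphi_{ab}|_{A_{1,2}B_{1,2}}\right) = 1 - \epsilon_{ab} \geq 1 - \epsilon_n \qquad (2.21)$$

for some $\epsilon_{ab} \leq \epsilon_n$. By Uhlmann's Theorem [Uhl76], there exist normalized (though not necessarily
orthogonal) states $|\gamma_{ab}\rangle$ and $|\eta_{ab}\rangle$ satisfying

$$|\varphi_{ab}\rangle = \sqrt{1 - \epsilon_n}\, |b\rangle_{A_1}|a\rangle_{B_1}|\Phi\rangle^{E^{(n)}}_{A_2B_2}|\gamma_{ab}\rangle_{A_3B_3} + \sqrt{\epsilon_n}\, |\eta_{ab}\rangle_{A_{1,2,3}B_{1,2,3}}. \qquad (2.22)$$

Note that we have changed $\epsilon_{ab}$ to $\epsilon_n$ by an appropriate choice of $|\eta_{ab}\rangle$. This will simplify the analysis
later.

To generate entanglement, Alice and Bob will apply $\mathcal{P}_n$ to registers $A_1B_1$ that are maximally
entangled with local ancillas $A_4B_4$; i.e. the states
$|\Phi\rangle^{C_1^{(n)}}_{A_1A_4} = 2^{-C_1^{(n)}/2} \sum_a |a\rangle_{A_1}|a\rangle_{A_4}$ and
$|\Phi\rangle^{C_2^{(n)}}_{B_1B_4} = 2^{-C_2^{(n)}/2} \sum_b |b\rangle_{B_1}|b\rangle_{B_4}$. The resulting output state is

$$|\varphi_n\rangle_{AB} = \sqrt{1 - \epsilon_n}\, |\psi_n\rangle_{AB} + \sqrt{\epsilon_n}\, |\delta_n\rangle_{AB}, \qquad (2.23)$$

where

$$|\psi_n\rangle_{AB} = 2^{-(C_1^{(n)} + C_2^{(n)})/2} \sum_{a,b} |b\rangle_{A_1}|a\rangle_{A_4}|a\rangle_{B_1}|b\rangle_{B_4}|\Phi\rangle^{E^{(n)}}_{A_2B_2}|\gamma_{ab}\rangle_{A_3B_3}. \qquad (2.24)$$

A similar expression exists for $|\delta_n\rangle_{AB}$, but it is not needed, so we omit it. Note that every Schmidt
coefficient of $|\psi_n\rangle$ is $\leq \exp(-(C_1^{(n)} + C_2^{(n)} + E^{(n)}))$, so $E(|\psi_n\rangle) \geq C_1^{(n)} + C_2^{(n)} + E^{(n)}$.

We will use Fannes' inequality (Lemma 1.1) to relate $E(|\varphi_n\rangle)$ to $E(|\psi_n\rangle)$. From Eq. (2.23), we
have $|\langle\varphi_n|\psi_n\rangle| \geq \sqrt{1 - \epsilon_n}$. Applying the relation between fidelity and trace distance in Eq. (1.4), we
find $\|\varphi_n - \psi_n\|_1 \leq 2\sqrt{\epsilon_n}$. Also, $|\varphi_n\rangle$ was created with $n$ uses of $U$, so
$\mathrm{Sch}(|\varphi_n\rangle) \leq (\mathrm{Sch}(U))^n \leq d^{2n}$.
Thus

$$|E(|\varphi_n\rangle) - E(|\psi_n\rangle)| \leq (2n\log d)\, 2\sqrt{\epsilon_n} + \eta(2\sqrt{\epsilon_n})$$

$$E(|\varphi_n\rangle) \geq n\left(C_1 + C_2 + E - 3\delta_n - 4\sqrt{\epsilon_n}\log d - \frac{\eta(2\sqrt{\epsilon_n})}{n}\right) \qquad (2.25)$$

Therefore as $n \to \infty$, $\frac{1}{n}E(|\varphi_n\rangle) \to C_1 + C_2 + E$. Since $n\langle U\rangle \geq \langle\varphi_n\rangle$, it follows that
$\langle U\rangle \geq (C_1 + C_2 + E)[qq]$.

We omit the quite similar proof of the $E < 0$ case; however, note that this case also follows from
the more general Theorem 3.1, which will be proved in Section 3.5. □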

A similar bound exists for the entanglement-assisted capacity: $C_+^E(U) \leq E(U) + E(U^\dagger)$. This
result is proved in [BS03a], though some preliminary steps are found in [BHLS03, BS03b]. Here we
give a sketch of the argument and explain its evolution through [BS03b, BHLS03, BS03a].

As in Proposition 2.10, Alice and Bob will input halves of maximally entangled states into a
communication protocol $\mathcal{P}_n$ that uses $U$ $n$ times. This creates $\approx nC_+^E(U)$ ebits. However, the
entanglement assistance leads to two additional complications. First, we need to bound the amount of
entanglement that $\mathcal{P}_n$ uses to communicate. Say that $\mathcal{P}_n$ starts with $E^{(n)}$ ebits. Then its entanglement
consumption is no greater than

$$\max_{a,b} \left[ E^{(n)} - E\left(\mathcal{P}_n |a\rangle_A |b\rangle_B |\Phi\rangle^{E^{(n)}}_{AB}\right) \right] \leq nE(U^\dagger)$$

(using $\Delta E_{U^\dagger} = E(U^\dagger)$ from Theorem 2.6). Here $E(U^\dagger)$ can be thought of as an entanglement
destroying capacity of $U$ if we recognize that unitarily disentangling a state is a nonlocal task. For
$U \in \mathcal{U}_{2\times 2}$, we always have $E(U) = E(U^\dagger)$, but for $d > 2$, numerical evidence suggests that equality
no longer holds [CLS02].
Since $\mathcal{P}_n$ uses no more than $nE(U^\dagger)$ ebits, we have $\langle U\rangle + E(U^\dagger)[qq] \geq C_+^E(U)[qq]$ and thus
$E(U) \geq C_+^E(U) - E(U^\dagger)$, implying the desired result. More generally, for any $(C_1, C_2, E) \in \mathrm{CCE}(U)$
this result implies that $(C_1, C_2, -E(U^\dagger)) \in \mathrm{CCE}(U)$; i.e. more than $E(U^\dagger)$ ebits are never needed
for any communication protocol.

The argument outlined above follows the presentation of [BS03b]. However, we also need to address
the second problem introduced by free entanglement. For the inefficiency caused by communication
errors to vanish as in Proposition 2.10, we need to ensure that the logs of the Schmidt numbers of the
states we work with grow at most linearly with $n$. Equivalently, we need to show that the parameter
$E^{(n)}$ from the previous paragraph can be chosen to be $\leq Kn$ for some constant $K$. In [BHLS03], the
explicit construction of Theorem 2.9 was used to achieve this bound for one-way communication, and
thereby to prove the weaker result that $C_\rightarrow^E(U) \leq E(U) + E(U^\dagger)$.

Finally [BS03a] proves an exponential bound on Schmidt rank for general bidirectional protocols,
by applying HSW coding in both directions to $\mathcal{P}_n$. Specifically, for any input of Bob's, Alice can
consider $\mathcal{P}_n$ to be a channel that communicates $n(C_1 - \delta_n)$ bits with error $\leq \epsilon_n$; such a channel has
HSW capacity $\approx n(C_1 - \delta_n)(1 - \epsilon_n) = nC_1 - o(n)$. Similarly, Bob can code for a channel to Alice
that has capacity $nC_2 - o(n)$. These block codes require $k$ blocks of $\mathcal{P}_n$ with $k \gg \exp(n)$, but now
the total error goes to zero as $k \to \infty$, while the entanglement cost $kE^{(n)}$ grows linearly with $k$. So
the desired capacity is achieved by taking $k \to \infty$ before $n$. A refined version of this argument will
be presented in the proof of Theorem 3.1 in Section 3.5.

Technically, HSW coding is not quite appropriate here, since Alice's channel weakly depends on 
Bob's input and vice versa. Thus, a small modification of [BS03a]'s proof is necessary. The correct 
coding theorem to use for bidirectional channels was given in 1961 by Shannon[Sha61] and can be used 
to obtain the result claimed in [BS03a] (see also [CLL05] for a generalization of Shannon's 1961 result 
to noisy bidirectional quantum channels). Unlike the HSW theorem and Shannon's original noisy 
channel coding theorem [Sha48], the two-way coding theorem only achieves low average error instead 
of low maximum error. For entanglement generation, average error is sufficient, but in the next
chapter we will show (in Theorem 3.1) that maximum error can also be made small for bidirectional 
protocols. In fact, the average error and maximum error conditions appear to be asymptotically 
equivalent in general, given some mild assumptions[DW05b, CK81]. 

2.3.4 Challenges for bidirectional communication 

We conclude our discussion of classical communication using unitary gates in this section, by reviewing 
attempts to extend Theorem 2.6 to the case of bidirectional communication and pointing out the 
difficulties that arise.

There is no bidirectional analogue of HSW coding, even classically. In [Sha61], Shannon considers
communication with noisy bidirectional channels (a model in some ways simpler, but in other ways
more complex, than unitary gates) and establishes upper and lower bounds that do not always
coincide. We briefly restate those bounds here. Define a bidirectional channel $N(A_{out}B_{out}|A_{in}B_{in})$
where $A_{in}$ is Alice's input, $B_{in}$ is Bob's input, $A_{out}$ is Alice's output and $B_{out}$ is Bob's output.
For any probability distribution on the inputs $A_{in}B_{in}$, consider the rate pair
$I(A_{in}; B_{out}|B_{in})[c \to c] + I(B_{in}; A_{out}|A_{in})[c \leftarrow c]$. [Sha61] proves that this rate pair is

• achievable if we maximize over product distributions on $A_{in}B_{in}$ (i.e. $I(A_{in}; B_{in}) = 0$); and

• an upper bound if we maximize over arbitrary distributions on $A_{in}B_{in}$ (i.e. if
$\langle N\rangle \geq C_1[c \to c] + C_2[c \leftarrow c]$, then there exists a joint distribution on $A_{in}B_{in}$ such that
$C_1 = I(A_{in}; B_{out}|B_{in})$ and $C_2 = I(B_{in}; A_{out}|A_{in})$).

Using the chain rule [CK81] we can rewrite these quantities suggestively as
$C_1 = I(A_{in}; B_{in}B_{out}) - I(A_{in}; B_{in})$ and $C_2 = I(B_{in}; A_{in}A_{out}) - I(A_{in}; B_{in})$. In this form, they
resemble Eq. (2.20): for communication from Alice to Bob we measure the difference between the output
correlation $I(A_{in}; B_{in}B_{out})$ and the input correlation $I(A_{in}; B_{in})$, and a similar expression holds for
communication from Bob to Alice. This has led [BS03a] to conjecture that a bidirectional version of
$\Delta\chi_U$ (defined in Eq. (2.20)) should describe the two-way classical capacity of a unitary gate. However,
even in the classical case, Shannon's inner and outer bounds on the capacity region (corresponding to
uncorrelated or correlated inputs respectively) are in general different.

This highlights the difficulties in coding for bidirectional channels. The messages both parties
send may interfere with each other, either positively or negatively. The best known protocols reduce
the bidirectional channel to a pair of one-way channels for which Alice and Bob code independently.
However, we cannot rule out the case in which Alice and Bob use correlated channel inputs to improve
the rate.

The same general concerns apply to quantum bidirectional channels, including unitary gates, 
although not all of the corresponding bounds have been proven. Some promising steps towards this 
goal are in [YDH05, Yar05], which derive capacity expressions for quantum channels with two inputs 
and one output. 

Reversible RSP is not possible for all bidirectional ensembles. The crucial ingredient in the proof
of Theorem 2.9 was the equivalence for any ensemble $\mathcal{E}$ (given unlimited entanglement) between
the induced $\{c \to q\}$ map $\mathcal{N}_{\mathcal{E}}$ and the standard resource $I(X_A; B)_{\mathcal{E}}[c \to c]$. Now suppose $\mathcal{E}$ is a
bidirectional ensemble $\sum_{i,j} p_i q_j\, |i\rangle\langle i|^{X_A} \otimes |j\rangle\langle j|^{Y_B} \otimes |\psi_{ij}\rangle\langle\psi_{ij}|^{AB}$, which has a corresponding
$\{cc \to qq\}$ channel $\mathcal{N}_{\mathcal{E}}$ mapping $|i\rangle^A|j\rangle^B$ to $|\psi_{ij}\rangle$. To extend Theorem 2.9 to the bidirectional case, we would
begin by trying to find pairs $(C_1, C_2)$ such that
$\langle\mathcal{N}_{\mathcal{E}}\rangle + \infty[qq] = C_1[c \to c] + C_2[c \leftarrow c] + \infty[qq]$. It turns
out that there are ensembles for which no such equivalence exists. In fact, classical communication
cannot reversibly simulate any ensemble whose classical capacity region is not just a rectangle. The
proof of this is trivial: if $\langle\mathcal{N}_{\mathcal{E}}\rangle + \infty[qq] = C_1[c \to c] + C_2[c \leftarrow c] + \infty[qq]$ then
$\langle\mathcal{N}_{\mathcal{E}}\rangle + \infty[qq] \geq R_1[c \to c] + R_2[c \leftarrow c]$ if and only if $R_1 \leq C_1$ and $R_2 \leq C_2$.

One simple example of an ensemble that cannot be reversibly simulated is the ensemble
corresponding to the AND channel: $|\psi_{ij}\rangle^{AB} = |i \wedge j\rangle^A |i \wedge j\rangle^B$, where $i, j \in \{0, 1\}$ and $i \wedge j$ is the logical
AND operation. Clearly $(1, 0) \in \mathrm{CC}(\mathrm{AND})$ and $(0, 1) \in \mathrm{CC}(\mathrm{AND})$; i.e. the AND ensemble can
send one bit from Alice to Bob or one bit from Bob to Alice. (The channel is effectively classical,
so we need not consider entanglement.) If AND were reversibly simulatable, then we would expect
$(1, 1) \in \mathrm{CC}(\mathrm{AND})$. However, $(1, \epsilon) \notin \mathrm{CC}(\mathrm{AND})$ for any $\epsilon > 0$. Suppose Bob sends zero with
probability $p$ and one with probability $1 - p$. When Bob sends zero, the channel output is $|00\rangle$ regardless
of Alice's input. Alice can only communicate to Bob during the $1 - p$ fraction of time that he sends
one, so she can only send $1 - p$ bits to him. Thus we must have $p = 0$. Since Bob always sends one,
he cannot communicate any information to Alice.
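
The tradeoff in the AND example can be illustrated with a short calculation. The sketch below evaluates the rate pair $I(A_{in}; B_{out}|B_{in})$, $I(B_{in}; A_{out}|A_{in})$ for independent inputs to the classical AND channel; the function names are hypothetical, and the closed forms used (each party's rate is the other party's probability of sending one, times the sender's binary entropy) are an assumption derived directly from the channel definition rather than taken from the text.

```python
import math


def h2(p):
    """Binary entropy in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def and_rates(pa_one, pb_one):
    """Rate pair for independent inputs to the AND channel.

    pa_one, pb_one: probabilities that Alice and Bob input 1.
    Alice's bit reaches Bob only when Bob inputs 1, and vice versa.
    """
    alice_to_bob = pb_one * h2(pa_one)
    bob_to_alice = pa_one * h2(pb_one)
    return alice_to_bob, bob_to_alice


print(and_rates(0.5, 1.0))  # Alice sends a full bit, so Bob's input must be fixed
print(and_rates(0.5, 0.5))  # both rates are strictly below one bit
```

With these expressions, Alice's rate reaches one bit only when Bob always inputs one, at which point Bob's own rate is zero, matching the argument above.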

One might object to the AND example by pointing out that simulating a relative resource is a more
reasonable goal, since the capacities of ensembles like AND vary with the probability distribution of
Alice and Bob's inputs. In fact, even in the one-way case the HSW/RSP equivalence in Eq. (2.19) is
only proven for relative resources.* However, one can construct ensembles where reversible simulation
is impossible even if the probability distribution of the input is fixed. We construct one such ensemble
(or channel) as follows:

Alice and Bob both input $(m+1)$-bit messages, $(a_1, a_2)$ and $(b_1, b_2)$, where $a_1$ and $b_1$ are single bits
and $a_2$ and $b_2$ are $m$-bit strings. The channel $\mathcal{N}$ computes the following string:
$(a_1 \oplus b_1, (a_1 \oplus b_1\ ?\ a_2 : b_2))$ and gives Alice and Bob both a copy of it. The notation
$(a_1 \oplus b_1\ ?\ a_2 : b_2)$ means that the channel
outputs $a_2$ if $a_1 \oplus b_1 = 1$ and $b_2$ if $a_1 \oplus b_1 = 0$. We choose the input probability distributions to be
uniform for both parties. Alice and Bob are allowed to agree upon any sort of block coding protocol
they wish as long as they still send each input approximately the same number of times.

First, we argue that $(m, 0), (0, m) \in \mathrm{CC}(\mathcal{N})$. The protocol to achieve $(m, 0)$ is as follows: Alice
sets $a_1 = 0$ for the first $n/2$ rounds and $a_1 = 1$ for the last $n/2$ rounds. Likewise, Bob sets $b_1 = 0$
for the first $n/2$ rounds and $b_1 = 1$ for the last $n/2$ rounds. The other two registers are set uniformly
at random. This satisfies the criteria of $p_i$ and $q_j$ being uniform, although it is a very particular
coding scheme. Since $a_1 \oplus b_1$ is always zero, it is always Alice's message $a_2$ which is broadcast to
both parties. Thus, this transmits $m$ bits to Bob per use of $\mathcal{N}$. If Bob instead sets $b_1 = 1$ for the
first $n/2$ rounds and $b_1 = 0$ for the last $n/2$ rounds, then the communication direction is reversed.

*Actually, the quantum reverse Shannon theorem [BDH+05] gives a reversible simulation of unrelativized $\{c \to q\}$
channels, though this appears not to be possible for general $\{q \to q\}$ channels, or for the coherent version of $\{c \to q\}$
channels that we will consider in the next chapter.

If $\mathcal{N}$ with uniformly distributed inputs had a rectangular rate region, then $(m, m)$ would also
be achievable. However any achievable $(C_1, C_2)$ must satisfy $C_1 + C_2 \leq m + 2$, since there is a
natural multi-round simulation for $\mathcal{N}$ that uses $m + 2$ total cbits. Choosing $m > 2$ yields a
non-rectangular rate region and hence a channel that cannot be efficiently simulated, even with a fixed
input probability distribution.
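
The channel and its multi-round simulation can be sketched in a few lines. This is a hypothetical illustration: it follows the stated convention that the channel outputs $a_2$ when $a_1 \oplus b_1 = 1$ and $b_2$ otherwise, represents the $m$-bit strings as tuples, and assumes the simulation consists of one round exchanging the two selector bits followed by one round in which the selected party sends its $m$-bit string.

```python
def channel(a1, a2, b1, b2):
    """One use of the channel: both parties receive the returned pair."""
    selector = a1 ^ b1
    return (selector, a2 if selector == 1 else b2)


def simulate(a1, a2, b1, b2):
    """Simulate one channel use with classical communication, counting cbits."""
    cbits = 0
    # Round 1: Alice sends a1 to Bob and Bob sends b1 to Alice.
    cbits += 2
    selector = a1 ^ b1
    # Round 2: the selected party sends its m-bit string to the other.
    payload = a2 if selector == 1 else b2
    cbits += len(payload)
    return (selector, payload), cbits


a2, b2 = (1, 0, 1, 1), (0, 0, 1, 0)  # m = 4
print(simulate(0, a2, 1, b2))
```

The simulation reproduces the channel output with $m + 2$ cbits in total, which is the count behind the bound $C_1 + C_2 \leq m + 2$.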

Arguably, even this example does not go far enough, since we could talk about simulating $\mathcal{N}$
with respect to a bipartite test state $\rho^{AB}$. However, it is hard to define a corresponding asymptotic
resource; the natural choice of $\langle\mathcal{N} : \rho^{AB}\rangle = (\mathcal{N}^{\otimes n} : \rho_{AB}^{\otimes n})_{n=1}^{\infty}$ violates Eq. (1.11) since extra input
test states $\rho^{AB}$ can no longer be created for free locally. On the other hand, $\langle\mathcal{N} : \rho^A \otimes \rho^B\rangle$ is a
well-defined resource for which there may be a reversible simulation, but since it cannot contain any
correlations between Alice and Bob it is hard to imagine using it in a protocol analogous to the one
in Theorem 2.9.

Combined, these facts mean that we are likely to need new methods and possibly new ways of
thinking about resources to find the two-way capacity regions of unitary gates.

2.4 Examples 

There are only a handful of examples of unitary gates where any capacities can be computed exactly.
On the other hand, some more complicated gates appear to give separations between quantities like
$C_\rightarrow$ and $C_\leftarrow$ or $C_+$ and $E$, though we will only be able to offer incomplete proofs for these claims.
This section will describe what is known about the capacities of all of these examples. Many of the
results on the two-qubit gates SWAP, CNOT and DCNOT are taken from [CLP01].

2.4.1 SWAP, CNOT and double CNOT 

We begin by reviewing three well-known gates in $\mathcal{U}_{2\times 2}$.

SWAP: The SWAP gate on two qubits is in a sense the strongest two-qubit gate; i.e. for any
two-qubit $U$, $\langle\mathrm{SWAP}\rangle = [q \to q] + [q \leftarrow q] \geq \langle U\rangle$. The proof follows the lines of Proposition 2.8: any
$U$ can be simulated by sending Alice's input to Bob using $[q \to q]$, Bob performing $U$ locally, and
then Bob sending Alice's qubit back with $[q \leftarrow q]$. Thus, we would expect it to saturate all of the
upper bounds we have found on capacities.

In fact the capacity region is

$$\mathrm{CCE}(\mathrm{SWAP}) = \{(C_1, C_2, E) : C_1 \leq 2,\ C_2 \leq 2,\ E \leq 2,\ \max(C_1, 0) + \max(C_2, 0) + E \leq 2\}. \qquad (2.26)$$

The first two upper bounds follow from Proposition 2.8 and the last two upper bounds from

$$\max(C_1, 0) + \max(C_2, 0) + E \leq E(\mathrm{SWAP}) \leq \log \mathrm{Sch}(\mathrm{SWAP}) = 2.$$

To show that this entire region is achievable, we can apply Proposition 2.10 to the single point
$(2, 2, -2) \in \mathrm{CCE}(\mathrm{SWAP})$. This in turn follows from applying super-dense coding in both directions to
obtain $\mathrm{SWAP} + 2[qq] = [q \to q] + [qq] + [q \leftarrow q] + [qq] \geq 2[c \to c] + 2[c \leftarrow c]$.

There are more direct proofs for some of the other extreme points of the capacity region; the
interested reader should try the exercise of finding a simple alternate proof of $\mathrm{SWAP} \geq 2[c \to c]$ (see
Eq. (63) of [CLP01] for the answer).

CNOT: It turns out that the capacity region of CNOT is exactly one half the size of the capacity
region for SWAP: $\mathrm{CCE}(\mathrm{CNOT}) = \{(C_1, C_2, E) : C_1 \leq 1,\ C_2 \leq 1,\ E \leq 1,\ \max(C_1, 0) + \max(C_2, 0) + E \leq 1\}$.
(We will later see that this is no accident but rather a consequence of the asymptotic equivalence
$2\langle\mathrm{CNOT}\rangle = \langle\mathrm{SWAP}\rangle$.)

The first three upper bounds follow from a simulation due to Gottesman [Got99],

$$[c \to c] + [c \leftarrow c] + [qq] \geq \langle\mathrm{CNOT}\rangle \qquad (2.27)$$

and causality. Then applying Proposition 2.10 yields the last bound.

In terms of achievability, $\langle\mathrm{CNOT}\rangle \geq [c \to c]$ is obvious, and $\langle\mathrm{CNOT}\rangle \geq [c \leftarrow c]$ follows from
$(H \otimes H)\,\mathrm{CNOT}\,(H \otimes H)|0\rangle|b\rangle = \mathrm{SWAP}\;\mathrm{CNOT}\;\mathrm{SWAP}\,|0\rangle|b\rangle = |b\rangle|b\rangle$. However, just as the entire SWAP
capacity region follows from $(2, 2, -2)$, the entire CNOT region follows from the inequality
$\mathrm{CNOT} + [qq] \geq [c \to c] + [c \leftarrow c]$, which is achieved by a protocol due to [CLP01]:

$$(Z^a H \otimes I)\,\mathrm{CNOT}\,(X^a \otimes Z^b)|\Phi_2\rangle_{AB} = |b\rangle_A |a\rangle_B \qquad (2.28)$$
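
The two-way protocol in Eq. (2.28) can be checked by direct simulation on the four basis cases. The sketch below is a hypothetical illustration using only the standard library: a two-qubit state is a dictionary from basis labels (Alice's bit, Bob's bit) to amplitudes, CNOT is taken with Alice's qubit as control, and all helper names are assumptions made for this example.

```python
import math
from itertools import product

R = 1 / math.sqrt(2)


def apply(state, rule):
    """Apply a gate given as a map from a basis pair to [(new pair, coefficient)]."""
    out = {}
    for basis, amp in state.items():
        for new_basis, coeff in rule(*basis):
            out[new_basis] = out.get(new_basis, 0.0) + amp * coeff
    return {k: v for k, v in out.items() if abs(v) > 1e-12}


def cnot(a, b):        # Alice's qubit is the control, Bob's the target
    return [((a, a ^ b), 1.0)]

def x_on_alice(a, b):
    return [((a ^ 1, b), 1.0)]

def z_on_alice(a, b):
    return [((a, b), -1.0 if a else 1.0)]

def z_on_bob(a, b):
    return [((a, b), -1.0 if b else 1.0)]

def h_on_alice(a, b):
    return [((0, b), R), ((1, b), -R if a else R)]


def cnot_two_way(a, b):
    """Alice sends bit a and Bob sends bit b using one CNOT and one ebit."""
    state = {(0, 0): R, (1, 1): R}       # shared ebit |Phi_2>
    if a:
        state = apply(state, x_on_alice)  # Alice encodes a with X^a
    if b:
        state = apply(state, z_on_bob)    # Bob encodes b with Z^b
    state = apply(state, cnot)
    state = apply(state, h_on_alice)      # Alice decodes with Z^a H
    if a:
        state = apply(state, z_on_alice)
    return state


for a, b in product((0, 1), repeat=2):
    print((a, b), cnot_two_way(a, b))
```

In each case the final state has support on the single basis label `(b, a)`, so Alice reads Bob's bit and Bob reads Alice's bit.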

Double CNOT: The double CNOT is formed by applying two CNOTs consecutively: first one
with Alice's qubit as control and Bob's as target, and then one with Bob's qubit as control and Alice's
as target. Equivalently we can write $\mathrm{DCNOT} = \mathrm{SWAP}\;\mathrm{CNOT}\;\mathrm{SWAP}\;\mathrm{CNOT}$. For $a, b \in \{0, 1\}$, we have
$\mathrm{DCNOT}|a\rangle|b\rangle = |b\rangle|a \oplus b\rangle$.

The double CNOT seems weaker than two uses of a CNOT, but it turns out to have the same
capacity region as the SWAP gate, or as $2\langle\mathrm{CNOT}\rangle$:

$$\mathrm{CCE}(\mathrm{DCNOT}) = \mathrm{CCE}(\mathrm{SWAP}) = \{(C_1, C_2, E) : C_1 \leq 2,\ C_2 \leq 2,\ E \leq 2,\ \max(C_1, 0) + \max(C_2, 0) + E \leq 2\}. \qquad (2.29)$$

The upper bounds are the same as for SWAP, and achievability is shown in [CLP01]. Specifically, they
give a protocol for the point $(2, 2, -2) \in \mathrm{CCE}(\mathrm{DCNOT})$, from which all other points follow.
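
The capacity regions stated for SWAP, CNOT and DCNOT share one form and differ only in a single bound, which makes membership easy to test. The following predicate is a minimal sketch with hypothetical function names; it simply encodes the inequalities from the region definitions above.

```python
def in_region(C1, C2, E, bound):
    """Membership test for a region of the form
    {(C1, C2, E): C1 <= bound, C2 <= bound, E <= bound,
                  max(C1, 0) + max(C2, 0) + E <= bound}."""
    return (C1 <= bound and C2 <= bound and E <= bound
            and max(C1, 0) + max(C2, 0) + E <= bound)


def in_cce_swap(C1, C2, E):
    return in_region(C1, C2, E, bound=2)   # same region for DCNOT


def in_cce_cnot(C1, C2, E):
    return in_region(C1, C2, E, bound=1)


print(in_cce_swap(2, 2, -2), in_cce_swap(2, 2, -1))
print(in_cce_cnot(1, 1, -1), in_cce_cnot(2, 0, 0))
```

The points $(2, 2, -2)$ and $(1, 1, -1)$ lie on the boundary of the respective regions: two-way communication at the maximum rate is possible only by consuming the matching amount of entanglement.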

Relations among SWAP, CNOT and Double CNOT: If we were to judge the strengths of
the SWAP, CNOT and DCNOT gates solely based on their capacity regions, then it would be reasonable
to conclude that

$$\langle\mathrm{SWAP}\rangle = 2\langle\mathrm{CNOT}\rangle = \langle\mathrm{DCNOT}\rangle. \qquad (2.30)$$

However, it has been historically difficult to construct efficient maps between these gates. [CLP01]
has conjectured that $2\,\mathrm{CNOT} \not\geq \mathrm{SWAP}$, and since $2\,\mathrm{CNOT} \geq \mathrm{DCNOT}$, this would imply that
$\mathrm{DCNOT} \not\geq \mathrm{SWAP}$. Moreover, [HVC02] shows that DCNOT takes less time than SWAP to simulate using nonlocal
Hamiltonians, implying that it somehow has less nonlocal power. A cute side effect of coherent
classical communication, which we will introduce in the next chapter, will be a concise proof of
Eq. (2.30), confirming the intuition obtained from capacity regions.

Of course, this simple state of affairs appears to be the exception rather than the rule. We now
consider two examples of gates whose capacity regions appear to be less well behaved.



2.4.2 A gate for which $C_\leftarrow(U)$ may be less than $C_\rightarrow(U)$

In this section we introduce a gate $U_{\mathrm{XOXO}} \in \mathcal{U}_{d\times d}$ that appears to have $C_\leftarrow(U) < C_\rightarrow(U)$ when $d$ is
sufficiently large. Define $U_{\mathrm{XOXO}}$ as follows:

$$U_{\mathrm{XOXO}}|x0\rangle = |xx\rangle \quad \forall\, 0 < x < d$$
$$U_{\mathrm{XOXO}}|xx\rangle = |x0\rangle \quad \forall\, 0 < x < d$$
$$U_{\mathrm{XOXO}}|xy\rangle = |xy\rangle \quad \forall\, x \neq y \neq 0$$

The first two lines are responsible for the gate's affectionate nickname, "XOXO." The $d = 2$ case
corresponds to a CNOT, which is locally equivalent to a symmetric gate, though as $d$ increases $U_{\mathrm{XOXO}}$
appears to be quite asymmetric.
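
Since the gate only permutes computational basis states, its definition can be checked classically. The sketch below is an illustration with a hypothetical function name; it assumes that $|00\rangle$ is left fixed, which the three defining lines do not cover explicitly.

```python
from itertools import product


def xoxo(x, y, d):
    """Action of the XOXO gate on the basis state |x>|y> in dimension d."""
    if 0 < x < d and y == 0:
        return (x, x)   # |x0> -> |xx>
    if 0 < x < d and y == x:
        return (x, 0)   # |xx> -> |x0>
    return (x, y)       # every other basis state is unchanged


d = 5
pairs = list(product(range(d), repeat=2))
# The map is a permutation of the basis (so the gate is unitary) ...
print(sorted(xoxo(x, y, d) for x, y in pairs) == pairs)
# ... and it is its own inverse.
print(all(xoxo(*xoxo(x, y, d), d) == (x, y) for x, y in pairs))
# For d = 2 it coincides with CNOT controlled on the first qubit.
print(all(xoxo(x, y, 2) == (x, x ^ y) for x, y in product(range(2), repeat=2)))
```

Alice's register is never changed by the gate, while Bob's register is changed in a way that depends on Alice's value; this is one way to see the asymmetry for larger $d$.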
Bounds on capacities for $U_{\mathrm{XOXO}}$

If Alice inputs $|a\rangle$ and Bob inputs $|0\rangle$, then Bob will obtain a copy of Alice's input $a$. Thus
$C_\rightarrow(U_{\mathrm{XOXO}}) \geq \log d$.

Define $S_x \in \mathcal{L}(\mathcal{H}_B)$ by

$$S_x|y\rangle = \begin{cases} |0\rangle & \text{if } x = y \\ |x\rangle & \text{if } 0 = y \\ |y\rangle & \text{otherwise} \end{cases}$$

Then $U_{\mathrm{XOXO}} = \sum_x |x\rangle\langle x| \otimes S_x$, so $\mathrm{Sch}(U_{\mathrm{XOXO}}) \leq d$. Thus $E(U_{\mathrm{XOXO}}) \leq \log d$. Combining
this with $C_\rightarrow(U_{\mathrm{XOXO}}) \geq \log d$ yields
$\log d \leq C_\rightarrow(U_{\mathrm{XOXO}}) \leq C_+(U_{\mathrm{XOXO}}) \leq E(U_{\mathrm{XOXO}}) \leq \log \mathrm{Sch}(U_{\mathrm{XOXO}}) \leq \log d$.
Thus these must all be equalities, and we have

$$C_\rightarrow(U_{\mathrm{XOXO}}) = C_+(U_{\mathrm{XOXO}}) = E(U_{\mathrm{XOXO}}) = \log \mathrm{Sch}(U_{\mathrm{XOXO}}) = \log d$$

These are the only capacities that we know how to determine exactly. However, we can bound a few
other capacities.

Suppose Alice and Bob share a $d$-dimensional maximally entangled state $|\Phi_d\rangle = \frac{1}{\sqrt{d}}\sum_x |x\rangle|x\rangle$.
Using such a state Bob can communicate $\log d$ bits to Alice. The protocol is as follows. Let
$b \in \{0, \ldots, d - 1\}$ be the message Bob wants to send and let $\omega = \exp(2\pi i/d)$. First Bob applies the
unitary transformation $\sum_x \omega^{bx}|x\rangle\langle x|$ to his half of $|\Phi_d\rangle$, leaving them with the state
$\frac{1}{\sqrt{d}}\sum_x \omega^{bx}|x\rangle|x\rangle$.
Then they apply the gate $U_{\mathrm{XOXO}}$ to obtain the product state $\frac{1}{\sqrt{d}}\sum_x \omega^{bx}|x\rangle|0\rangle$. Alice can now apply
the inverse Fourier transform $\frac{1}{\sqrt{d}}\sum_{x,y} |x\rangle\langle y|\,\omega^{-xy}$ to recover Bob's message.

Thus $C_\leftarrow^E(U_{\mathrm{XOXO}}) \geq \log d$. This yields a lower bound for $C_\leftarrow(U_{\mathrm{XOXO}})$ as well, since one possible
communication strategy for Bob is to use $U_{\mathrm{XOXO}}$ once to create a copy of $|\Phi_d\rangle$ and a second time to
send $\log d$ bits to Alice, using up the copy of $|\Phi_d\rangle$.

So $\frac{1}{2}\log d \leq C_\leftarrow(U_{\mathrm{XOXO}}) \leq \log d$. We would like to know whether
$C_\leftarrow(U_{\mathrm{XOXO}}) < C_\rightarrow(U_{\mathrm{XOXO}}) = \log d$. We cannot prove this expression asymptotically, but can show that if Alice and Bob share no
entanglement and are initially uncorrelated, Alice's mutual information with Bob's message is strictly
less than $\log d$ after a single use of $U_{\mathrm{XOXO}}$.
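
Bob's entanglement-assisted protocol described above (phase encoding, one use of the gate, then an inverse Fourier transform on Alice's side) can be simulated directly. The sketch below is a hypothetical illustration using only the standard library; the helper names are assumptions, and the gate is implemented as a basis permutation that leaves $|00\rangle$ fixed.

```python
import cmath
import math


def xoxo(x, y, d):
    """Basis-state action of the XOXO gate (|00> is assumed to stay fixed)."""
    if 0 < x < d and y == 0:
        return (x, x)
    if 0 < x < d and y == x:
        return (x, 0)
    return (x, y)


def bob_sends(b, d):
    """Simulate Bob sending b in {0, ..., d-1} to Alice with one ebit and one gate use."""
    omega = cmath.exp(2j * cmath.pi / d)
    # Shared state after Bob applies the phase omega^(b*x) to his half of |Phi_d>.
    state = {(x, x): omega ** (b * x) / math.sqrt(d) for x in range(d)}
    # One use of the gate maps |x>|x> to |x>|0>, so Bob's register factors out.
    state = {xoxo(x, y, d): amp for (x, y), amp in state.items()}
    assert all(y == 0 for (_, y) in state)
    alice = [state[(x, 0)] for x in range(d)]
    # Alice applies the inverse Fourier transform and measures.
    out = [sum(omega ** (-k * x) * alice[x] for x in range(d)) / math.sqrt(d)
           for k in range(d)]
    probs = [abs(amp) ** 2 for amp in out]
    guess = max(range(d), key=lambda k: probs[k])
    return guess, probs[guess]


for b in range(5):
    print(b, bob_sends(b, 5))
```

In this simulation Alice's measurement returns Bob's message with probability one up to rounding error, consistent with Bob sending $\log d$ bits per use when entanglement is available.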



Bounding the one-shot rate of í/xoxo 

Proposition 2.11. If Alice and Bob share no entanglement and input uncorrelated states into
$U_{\mathrm{XOXO}}$, Alice's mutual information with Bob's message is less than $(1 - \epsilon)\log d + O(1)$ for some
constant $\epsilon > 0$.

Proof. Let $\alpha, \beta, \gamma$ be small positive parameters that we will choose later.

Assume Alice begins with fixed input $|\psi^A\rangle^A = \sum_i a_i |i\rangle^{A_1} \sum_j A_{ij} |j\rangle^{A_2}$, where
$\sum_i |a_i|^2 = \sum_j |A_{ij}|^2 = 1$ and $A$ denotes the composite Hilbert space $A_1A_2$. Let
$R \subseteq \{0, \ldots, d-1\}$ be the set given by

\[ R = \{ i : |a_i|^2 \ge \alpha \}. \]

The normalization condition means that $|R| \le 1/\alpha$.

Bob will signal to Alice with some ensemble $\mathcal{E} = \sum_x p_x |x\rangle\langle x|^{X_B} \otimes |\psi_x\rangle\langle\psi_x|^B$. We will divide the
indices $x$ into three sets $S_1$, $S_2$ and $S_3$, according to various properties of the states $|\psi_x\rangle$. Write one
such state as $|\psi_x\rangle^B = \sum_i b_i^{(x)} |i\rangle^{B_1} \sum_j B_{ij}^{(x)} |j\rangle^{B_2}$, where again $B$ denotes the composite Hilbert space $B_1B_2$,
$U_{\mathrm{XOXO}}$ acts on $A_1B_1$ and $A_2B_2$ are ancilla systems. Now define $S_1$, $S_2$ and $S_3$ by

\[ S_1 = \{ x : |b_0^{(x)}|^2 \ge \beta \} \tag{2.31} \]

\[ S_2 = \Big\{ x : |b_0^{(x)}|^2 < \beta \text{ and } \sum_{i \in R} |b_i^{(x)}|^2 \ge \gamma \Big\} \tag{2.32} \]

\[ S_3 = \Big\{ x : |b_0^{(x)}|^2 < \beta \text{ and } \sum_{i \in R} |b_i^{(x)}|^2 < \gamma \Big\} \tag{2.33} \]

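Without loss of generality, we can introduce a second classical register for Bob, $Y_B$, that records
which of the $S_y$ the index $x$ belongs to. If we also include Alice's fixed input state, then $\mathcal{E}$ becomes

\[ \mathcal{E} = \sum_{y \in \{1,2,3\}} |y\rangle\langle y|^{Y_B} \otimes \sum_{x \in S_y} p_x |x\rangle\langle x|^{X_B} \otimes |\psi_x\rangle\langle\psi_x|^{AB}, \tag{2.34} \]

where $|\psi_x\rangle^{AB} := |\psi^A\rangle^A |\psi_x\rangle^B$.

After $U := U_{\mathrm{XOXO}}$ is applied, the parties are left with the ensemble $U(\mathcal{E}) := (U^{A_1B_1} \otimes
I^{X_B Y_B A_2 B_2})(\mathcal{E})$. The mutual information of Alice's state with Bob's message is given by

\begin{align}
I(X_B; A)_{U(\mathcal{E})} &= I(X_B Y_B; A)_{U(\mathcal{E})} = I(X_B; A | Y_B)_{U(\mathcal{E})} + I(A; Y_B)_{U(\mathcal{E})} \nonumber \\
&\le I(X_B; A | Y_B)_{U(\mathcal{E})} + \log 3 \le \max_{y \in \{1,2,3\}} I(X_B; A)_{U(\mathcal{E}_y)} + \log 3. \tag{2.35}
\end{align}

Here we have defined the ensemble $\mathcal{E}_y$, for $y \in \{1, 2, 3\}$, to be the ensemble $\mathcal{E}$ conditioned on $Y_B = y$;
i.e.

\[ \mathcal{E}_y := \frac{1}{\sum_{x \in S_y} p_x} \sum_{x \in S_y} p_x |x\rangle\langle x|^{X_B} \otimes |y\rangle\langle y|^{Y_B} \otimes |\psi_x\rangle\langle\psi_x|^{AB}. \tag{2.36} \]

Thus to prove our proposition it suffices to verify that $I(X_B; A)_{U(\mathcal{E}_y)} \le (1 - \epsilon) \log d + O(1)$ for each
choice of $y$.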

For cases y = 1, 2 we will use the following two facts. 

Fact 2.12. Let $\rho$ be a $d$-dimensional state and suppose that $\mathrm{Tr}\,\Pi\rho = p$ for some $k$-dimensional
projector $\Pi$. Then measuring $\{\Pi, I - \Pi\}$ yields a state with entropy no greater than

\[ -k \frac{p}{k} \log \frac{p}{k} - (d-k) \frac{1-p}{d-k} \log \frac{1-p}{d-k} = H_2(p) + p \log k + (1-p) \log(d-k) \le 1 + \log k + (1-p) \log d. \tag{2.37} \]

Since $H(\rho) \le H(\Pi\rho\Pi + (I - \Pi)\rho(I - \Pi))$ it follows that $H(\rho) \le (1 - p) \log d + 1 + \log k$. If we treat
$k$ as a constant, then this is $(1 - p) \log d + O(1)$.

Fact 2.13. The mutual information of the output is bounded by the entropy of Bob's input as follows:

\[ I(X_B; A)_{U(\mathcal{E}_y)} \le I(X_B; AB_1)_{U(\mathcal{E}_y)} = I(X_B; AB_1)_{\mathcal{E}_y} \le H(AB_1)_{\mathcal{E}_y} = H(B_1)_{\mathcal{E}_y}. \tag{2.38} \]

We can now prove that $I(X_B; A)_{U(\mathcal{E}_1)} \le (1 - \epsilon)\log d + O(1)$. By the definition of $S_1$, we have
$\langle 0|\mathcal{E}_1^{B_1}|0\rangle \ge \beta$. Now we use first Fact 2.13 and then Fact 2.12 (with the projector $\Pi = |0\rangle\langle 0|^{B_1}$) to
obtain

\[ I(X_B; A)_{U(\mathcal{E}_1)} \le H(B_1)_{\mathcal{E}_1} \le (1 - \beta)\log d + 1. \tag{2.39} \]

This last expression is $\le (1 - \epsilon) \log d + 1$ as long as $\epsilon \le \beta$.

The case of $y = 2$ will yield to similar analysis. Define $|i'\rangle \in \mathcal{H}_B$ by $|i'\rangle^B = |i\rangle^{B_1} \otimes \sum_j B_{ij}|j\rangle^{B_2}$.

Now define a projector $\Pi = \sum_{i \in R} |i'\rangle\langle i'|^B$ so that $\mathrm{Tr}\,\Pi = |R|$ and $p := \langle\psi^B|\Pi|\psi^B\rangle = \sum_{i \in R} |b_i|^2$.
Note that $\mathrm{Tr}\,\Pi \le 1/\alpha$ and $p \ge \gamma$. Now we can again use Facts 2.13 and 2.12 to bound $I(X_B; A)_{U(\mathcal{E}_2)} \le
1 + \log 1/\alpha + (1 - \gamma) \log d$. This is $\le (1 - \epsilon) \log d + O(1)$ if we choose $\epsilon \le \gamma$.

Note that these two bounds are independent of U. They simply say that having a lot of weight 
in a small number of dimensions limits the potential for communication. Case S3 is the interesting 
case. Here we will argue that if Bob inputs a state that is not zero and does not match Alice's state 
well, he will not change Alice's state very much. 

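Suppose a particular input state can be expressed as

\[ |\psi\rangle = \sum_{ijkl} a_i b_j A_{ik} B_{jl} |ijkl\rangle^{A_1B_1A_2B_2}. \]

According to the definition of $S_3$, $|b_0|^2 < \beta$ and $\sum_{i \in R} |b_i|^2 < \gamma$, where $R = \{i : |a_i|^2 \ge \alpha\}$.
After one use of the nonlocal gate $U$, the new state is

\[ |\psi'\rangle := U|\psi\rangle = |\psi\rangle + \sum_{i \ne 0} \sum_{k,l} \Big( a_i b_i A_{ik} B_{il} |i0kl\rangle + a_i b_0 A_{ik} B_{0l} |iikl\rangle - a_i b_i A_{ik} B_{il} |iikl\rangle - a_i b_0 A_{ik} B_{0l} |i0kl\rangle \Big). \]

Writing $|\psi'\rangle$ in this form is useful for bounding the state change

\begin{align}
\| |\psi'\rangle - |\psi\rangle \|^2 &= \sum_{i,k,l} |a_i|^2 |A_{ik}|^2 \left( |b_i B_{il} - b_0 B_{0l}|^2 + |b_0 B_{0l} - b_i B_{il}|^2 \right) \nonumber \\
&= 2 \sum_{i,l} |a_i|^2 |b_i B_{il} - b_0 B_{0l}|^2 \nonumber \\
&\le 4 \sum_{i,l} |a_i|^2 \left( |b_i|^2 |B_{il}|^2 + |b_0|^2 |B_{0l}|^2 \right) \tag{2.40} \\
&= 4 \sum_i |a_i|^2 \left( |b_i|^2 + |b_0|^2 \right) \nonumber \\
&= 4 \sum_{i \in R} |a_i b_i|^2 + 4 \sum_{i \notin R} |a_i b_i|^2 + 4 |b_0|^2 \nonumber
\end{align}

where the inequality on the third line follows from the general bound $|x - y|^2 \le (|x| + |y|)^2 =
2(|x|^2 + |y|^2) - (|x| - |y|)^2 \le 2(|x|^2 + |y|^2)$.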
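
The chain of bounds in Eq. (2.40) can be spot-checked on random product inputs. The sketch below (Python with NumPy, an illustration rather than part of the proof) assumes the gate exchanges $|i\rangle|0\rangle$ and $|i\rangle|i\rangle$ on $A_1B_1$ for $i \ne 0$; the helper names `random_unit` and `state_change` and the ancilla dimension `k` are illustrative assumptions.

```python
import numpy as np

def random_unit(n, rng):
    """Random complex unit vector of length n."""
    v = rng.normal(size=n) + 1j * rng.normal(size=n)
    return v / np.linalg.norm(v)

def state_change(d, k, rng):
    """Return (||U psi - psi||^2, second line of Eq. (2.40), fourth line of Eq. (2.40))."""
    a = random_unit(d, rng)
    b = random_unit(d, rng)
    # Rows A[i, :] and B[j, :] are unit vectors on the ancillas A_2 and B_2
    A = np.stack([random_unit(k, rng) for _ in range(d)])
    B = np.stack([random_unit(k, rng) for _ in range(d)])
    # psi[i, j, k, l] = a_i b_j A_ik B_jl on A_1 B_1 A_2 B_2
    psi = np.einsum('i,j,ik,jl->ijkl', a, b, A, B)
    new = psi.copy()
    for i in range(1, d):
        # The gate swaps the |i,0> and |i,i> components on A_1 B_1 (assumed definition)
        new[i, 0], new[i, i] = psi[i, i], psi[i, 0]
    change = np.linalg.norm(new - psi) ** 2
    diff = b[1:, None] * B[1:] - b[0] * B[0][None, :]
    exact = 2 * np.sum(np.abs(a[1:]) ** 2 * np.sum(np.abs(diff) ** 2, axis=1))
    bound = 4 * np.sum(np.abs(a) ** 2 * (np.abs(b) ** 2 + np.abs(b[0]) ** 2))
    return change, exact, bound
```

For every random input the first two returned values should agree and neither should exceed the third.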

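We can bound each of the three terms in Eq. (2.40) separately. First,

\[ \sum_{i \in R} |a_i b_i|^2 \le \sum_{i \in R} |b_i|^2 < \gamma. \]

The second term is

\[ \sum_{i \notin R} |a_i b_i|^2 \le \sum_{i \notin R} \Big( \max_{j \notin R} |a_j|^2 \Big) |b_i|^2 \le \alpha \sum_{i \notin R} |b_i|^2 \le \alpha \sum_i |b_i|^2 = \alpha. \]

The third term is simply $|b_0|^2 < \beta$.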

Thus $\| |\psi'\rangle - |\psi\rangle \|^2 \le 4(\alpha + \beta + \gamma)$. In terms of fidelity, $F(|\psi\rangle, |\psi'\rangle) = |\langle\psi|\psi'\rangle|^2 \ge 1 - 4(\alpha + \beta + \gamma)$.
Converting this to trace distance means that $\frac{1}{2}\| |\psi\rangle\langle\psi| - |\psi'\rangle\langle\psi'| \|_1 \le 2\sqrt{\alpha + \beta + \gamma}$. Since this holds
for each element of $\mathcal{E}_3$ and trace distance is convex it follows that $\frac{1}{2}\| \mathcal{E}_3^A - U(\mathcal{E}_3)^A \|_1 \le 2\sqrt{\alpha + \beta + \gamma}$.
Alice's system is initially in a pure state, so we can do a Schmidt decomposition between $A$ and $A'$
and thus assume that $\dim A' = d$. This also means that $H(\mathcal{E}_3^A) = 0$. Using Fannes' inequality then
yields $I(X_B; A)_{U(\mathcal{E}_3)} \le H(A)_{U(\mathcal{E}_3)} \le 4\sqrt{\alpha + \beta + \gamma} \cdot 2 \log d + (\log e)/e \le (1 - \epsilon) \log d + O(1)$ as long
as $\epsilon \le 1 - 8\sqrt{\alpha + \beta + \gamma}$.

This proves our claim for any $\alpha, \beta, \gamma > 0$ as long as $\epsilon \le \beta$, $\epsilon \le \gamma$ and $\epsilon \le 1 - 8\sqrt{\alpha + \beta + \gamma}$.
This clearly holds as long as $\alpha$, $\beta$, $\gamma$ and $\epsilon$ are small enough. The largest value of $\epsilon$ possible is
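$(\sqrt{33} - \sqrt{32})^2 \approx 0.0077$, when $\alpha \approx 0$ and $\beta = \gamma = \epsilon$ (the last condition then reads $\epsilon = 1 - 8\sqrt{2\epsilon}$, whose solution is $\sqrt{\epsilon} = \sqrt{33} - \sqrt{32}$). □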

I suspect that the actual asymptotic capacity is closer to $\frac{1}{2} \log d + O(1)$ for large values of $d$, but
more careful techniques will be required to prove this.

2.4.3 A gate for which $C_+(U)$ may be less than $E(U)$

Another separation that appears plausible is between the total classical capacity $C_+(U)$ and the
entanglement capacity $E(U)$. In this section we present an example of a gate $U$ for which it appears
that $C_+(U) < E(U)$, though, as with the last section, we cannot actually prove this claim.

The gate is defined (for any $d$) as follows: $U = I + |\Phi_d\rangle\langle 01| + |01\rangle\langle\Phi_d| - |01\rangle\langle 01| - |\Phi_d\rangle\langle\Phi_d|$.
Obviously, $E(U) \ge \log d$, since $U|01\rangle = |\Phi_d\rangle$. This inequality is not quite tight (i.e. $E(U) > \log d$
and probably $E(U) \approx \log d + O(1)$), but this doesn't matter for the argument.
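
The two properties used here, that $U$ is unitary and that it maps $|01\rangle$ to a state with $\log d$ ebits, are easy to confirm numerically. A minimal sketch in Python with NumPy follows; the names `gate_243` and `entanglement_bits` are illustrative and not taken from the text.

```python
import numpy as np

def gate_243(d):
    """U = I + |Phi><01| + |01><Phi| - |01><01| - |Phi><Phi| on C^d (x) C^d, for d >= 2."""
    phi = np.zeros(d * d)
    phi[np.arange(d) * (d + 1)] = 1 / np.sqrt(d)   # |Phi_d> = (1/sqrt d) sum_x |x,x>
    e01 = np.zeros(d * d)
    e01[1] = 1.0                                    # |0,1> has index 0*d + 1
    U = (np.eye(d * d) + np.outer(phi, e01) + np.outer(e01, phi)
         - np.outer(e01, e01) - np.outer(phi, phi))
    return U, phi, e01

def entanglement_bits(state, d):
    """Entropy of entanglement, in bits, of a pure state on C^d (x) C^d."""
    s = np.linalg.svd(state.reshape(d, d), compute_uv=False) ** 2
    s = s[s > 1e-12]
    return float(-np.sum(s * np.log2(s)))
```

Because $|01\rangle$ and $|\Phi_d\rangle$ are orthogonal, $U$ simply exchanges them and acts as the identity on the orthogonal complement, which is why it is unitary.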

I conjecture that $C_+^E(U) = O(1) < \log d$ for large $d$. However, the only statement that can readily
be proven is that, as in the last section, a single use of $U$ for one-way communication on uncorrelated
product inputs can create strictly less than $\log d$ bits of mutual information, for $d$ sufficiently large.

The proof is actually almost identical to the proof of the last section, though slightly simpler. If
Alice and Bob input product states, then the overlap of their states with $|\Phi_d\rangle$ is $\le 1/\sqrt{d}$, so this
portion of $U$ has little effect. We divide Alice's signal ensemble into a part with a large $|0\rangle$ component
(which has low entropy) and a part with a small $|0\rangle$ component (which is nearly unchanged by the
action of $U$). As a result, the total amount of information that Alice can send to Bob (or that Bob
can send to Alice) with one use of $U$, starting from uncorrelated product states, is strictly less than
the entanglement capacity. However, this argument is far from strong enough to prove a separation
between asymptotic capacities.

2.5 Discussion 

We conclude this chapter by restating its key results and discussing some of the major open questions.
Most of the gate capacities can be expressed in terms of the three-dimensional region $\mathrm{CCE}(U) :=
\{(C_1, C_2, E) : \langle U \rangle \ge C_1 [c \to c] + C_2 [c \leftarrow c] + E[qq]\}$. The two coding theorems (2.6 and 2.9) establish
that

• $\max\{E : \langle U \rangle \ge E[qq]\} =: E(U) = \Delta E_U := \sup_{|\psi\rangle} H(B)_{U|\psi\rangle} - H(B)_{|\psi\rangle}$,

• $\max\{C : \langle U \rangle + \infty[qq] \ge C[c \to c]\} =: C_\to^E(U) = \Delta\chi_U := \sup_{\mathcal{E}} \chi(\mathrm{Tr}_A U(\mathcal{E})) - \chi(\mathrm{Tr}_A \mathcal{E})$.

The key bounds (from Propositions 2.4, 2.5, 2.8 and 2.10) are

• $C_+(U) \le E(U) \le \log \mathrm{Sch}(U) \le 2\log d$

• $C_+^E(U) \le \min(4\log d, E(U) + E(U^\dagger))$

• $C_\to^E(U), C_\leftarrow^E(U) \le 2\log d$

• $\langle U \rangle \ge H(\lambda)[qq]$ where $\{\lambda_i\}$ are the Schmidt coefficients of $U$.

• $C_\to(U) > 0 \iff C_\leftarrow(U) > 0 \iff E(U) > 0 \iff \mathrm{Sch}(U) > 1 \iff U$ is nonlocal.

These results suggest a number of open questions.

• Can we find an upper bound on the dimension of the ancillas $A'B'$ that are needed for an optimal
input state for entanglement generation? For entanglement-assisted classical communication,
how large do the dimensions of $A'B'$ need to be, and how many states are needed in the optimal
ensemble? These are important for numerical studies of the capacities.

• Do there exist $U$ such that $C_\to(U) \ne C_\leftarrow(U)$? Note that $U$ cannot be a two-qubit gate since the
decomposition in Eq. (2.1) implies that two-qubit gates have symmetric capacities. I conjecture
that $C_\to(U_{\mathrm{XOXO}}) \ne C_\leftarrow(U_{\mathrm{XOXO}})$ for $U_{\mathrm{XOXO}}$ defined as in Section 2.4.2.

• Do there exist $U$ such that $C_\to^E(U) \ne C_\leftarrow^E(U)$? All of the examples of gates in Section 2.4 satisfy
$U = U^\dagger$, but unpublished work with Peter Shor proves that in this case the entanglement-
assisted capacity regions are fully symmetric. It seems plausible that this situation would hold
in general, but no proof or counterexample is known.

• Do there exist $U$ for which $C_+(U) < E(U)$? I conjecture that this inequality holds for the gate
defined in Section 2.4.3.

• Is $E(U) = E(U^\dagger)$? Both quantities relate to how entangling a nonlocal gate is. However, we can
only prove the equality when $U = U^T$, by using the fact $E(U) = E(U^*)$.* This generalizes the
proof in Ref. [BS03b] for 2-qubit gates since $U = U^T$ for all 2-qubit gates that are decomposed
in the form of Eq. (2.1). Numerical work suggests that the equality does not hold for some $U$
in higher dimensions [CLS02]. More generally, we can ask whether $\mathrm{CCE}(U) = \mathrm{CCE}(U^\dagger)$.

• Is $E(U)$ completely determined by the Schmidt coefficients of $U$?

• It seems unlikely that classical capacity can be determined by Schmidt coefficients alone, but
can we derive better lower and upper bounds on classical capacity based on the Schmidt coefficients
of a gate? Specifically, can we show that $C_+^E(U) \le 2\log \mathrm{Sch}(U)$, or even better, that
$\log \mathrm{Sch}(U) \left( [q \to q] + [q \leftarrow q] \right) \ge \langle U \rangle$? Right now these inequalities are only known to be true
when $\mathrm{Sch}(U)$ is maximal (i.e. equal to $d_A d_B$ when $U \in \mathcal{U}_{d_A \times d_B}$).



*This is because $\max_\psi E(U|\psi\rangle) - E(|\psi\rangle) = \max_\psi E(U|\psi^*\rangle) - E(|\psi^*\rangle) = \max_\psi E(U^*|\psi\rangle) - E(|\psi\rangle)$.

Chapter 3

Coherent classical communication 



3.1 Introduction and definition 

One of the main differences between classical and quantum Shannon theory is the number of irreversible,
but optimal, resource transformations that exist in quantum Shannon theory. The highest
rate that ebits or cbits can be created from qubits is one-for-one: $[q \to q] \ge [qq]$ and $[q \to q] \ge [c \to c]$.
But the best way to create qubits from cbits and ebits is teleportation: $2[c \to c] + [qq] \ge [q \to q]$.
These protocols are all asymptotically optimal (for example, the classical communication requirement
of teleportation cannot be decreased even if entanglement is free), but composing them is
extremely wasteful: $3[q \to q] \ge 2[c \to c] + [qq] \ge [q \to q]$. This sort of irreversibility represents one
of the main challenges of quantum information theory: resources may be qualitatively equivalent but
quantitatively incomparable.

In this chapter we will introduce a new primitive resource: the coherent bit or cobit. To emphasize
its connection with classical communication, we denote the asymptotic resource (defined below) by
$[[c \to c]]$.* Coherent classical communication will simplify and improve a number of topics in quantum
Shannon theory:

• We will find that coherently decoupled cbits can be described more simply and naturally as 
cobits. 

• Replacing coherently decoupled cbits with cobits will make many resource transformations
reversible. In particular, teleportation and super-dense coding become each other's inverses, a
result previously only known when unlimited entanglement is allowed.

• More generally, we find that many forms of irreversibility in quantum Shannon theory are
equivalent to the simple map $[[c \to c]] \ge [c \to c]$.

• We will expand upon Proposition 2.10 to precisely explain how cbits are more powerful when
they are sent through unitary means. This has a number of consequences for unitary gate
capacities.

• In the next chapter, coherent classical communication will be used to relate many of the different 
protocols in quantum Shannon theory, give simple proofs of some existing protocols and create 
some entirely new protocols. These will allow us to determine two-dimensional tradeoff curves 
for the capacities of channels and states to create or consume cbits, ebits and qubits. 

Coherent classical communication can be defined in two ways, which we later show to be equivalent. 

*Other work [DHW05, Dev05b] uses $[q \to qq]$ to denote cobits, in order to emphasize their central place among
isometries from $A$ to $AB$.



• Explicit definition in terms of finite resources:

Fix a basis for $\mathbb{C}^d$: $\{|x\rangle\}_{x=0}^{d-1}$. First, we recall from Section 1.2.1 the definitions of
quantum and classical communication: $\mathrm{id}_d = \sum_x |x\rangle^B \langle x|^A$ (a perfect quantum channel),
$\overline{\mathrm{id}}_d = \sum_x |x\rangle^B |x\rangle^E \langle x|^A$ (a perfect classical channel in the QP formalism) and $\overline{\Delta}_d =
\sum_x |x\rangle^A |x\rangle^B |x\rangle^E \langle x|^A$ (the classical copying operation in the QP formalism). Then we define a
perfect coherent channel as

\[ \Delta_d = \sum_{x=0}^{d-1} |x\rangle^A |x\rangle^B \langle x|^A. \tag{3.1} \]

It can be thought of as a purification of a cbit in which Alice controls the environment, as a
sort of quantum analogue to a feedback channel. The asymptotic resource is then given by
$[[c \to c]] := \langle \Delta_2 \rangle$.

• Operational definition as an asymptotic resource: We can also define a cobit as a cbit sent
through unitary, or more generally isometric, means. The approximate version of this statement
is that whenever a protocol creates coherently decoupled cbits (cf. Definition 1.14), then a
modified version of the protocol will create cobits. Later we will prove a precise form of this
statement, known as "Rule O," because it describes how output cbits should be made coherent.

When $C$ input cbits are coherently decoupled (cf. Definition 1.13) we instead find that replacing
them with $C$ cobits results in $C$ extra ebits being generated in the output. This input rule is
known as "Rule I." Both rules are proved in Section 3.5.

The canonical example of coherent decoupling is when cbits are sent using a unitary gate.
In Theorem 3.1, we show that cbits sent through unitary means can indeed be coherently
decoupled, and thereby turned into cobits.

The rest of the chapter is organized as follows. 
Section 3.2 will give some simple examples of how cobits can be obtained. 

Section 3.3 will then describe how to use coherent classical communication to make quantum pro- 
tocols reversible and more efficient. It will conclude with a precise statement of Rules I and 
O. 

Section 3.4 will apply these general principles to remote state preparation [BHL+05], which leads
to new protocols for super-dense coding of quantum states [HHL04] as well as many new results
for unitary gate capacities.

Section 3.5 collects some of the longer proofs from the chapter, in order to avoid interrupting the
exposition of the rest of the chapter.

Section 3.6 concludes with a brief discussion. 

Bibliographical note: Most of this chapter is based on [Har04], though in Section 3.5 the proofs
of Rules I and O are from [DHW05] (joint work with Igor Devetak and Andreas Winter), and the
proof of Theorem 3.1 is from [HL05] (joint work with Debbie Leung).

3.2 Sources of coherent classical communication 

Qubits and cbits arise naturally from noiseless and dephasing channels respectively, and can be obtained
from any noisy channel by appropriate coding [Hol98, SW97, Llo96, Sho02, Dev05a]. Similarly,
we will show both a natural primitive yielding coherent bits and a coding theorem that can generate
coherent bits from a broad class of unitary operations.

The simplest way to send a coherent message is by modifying super-dense coding (SD). In SD,
Alice and Bob begin with $|\Phi_2\rangle$ and want to use $\mathrm{id}_2$ to send a two bit message $a_1a_2$ from Alice to Bob.
Alice encodes her message by applying $Z^{a_1}X^{a_2}$ to her half of $|\Phi_2\rangle$ and then sending it to Bob, who
decodes by applying $(H \otimes I)\mathrm{CNOT}$ to the state, obtaining

\[ (H \otimes I)\,\mathrm{CNOT}\,(Z^{a_1}X^{a_2} \otimes I)|\Phi_2\rangle = |a_1\rangle|a_2\rangle. \]

Now modify this protocol so that Alice starts with a quantum state $|a_1a_2\rangle$ and applies $Z^{a_1}X^{a_2}$ to
her half of $|\Phi_2\rangle$ conditioned on her quantum input. After she sends her qubit and Bob decodes, they
will be left with the state $|a_1a_2\rangle^A |a_1a_2\rangle^B$. Thus,

\[ [q \to q] + [qq] \ge 2[[c \to c]]. \tag{3.2} \]
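
Both the standard and the coherent version of super-dense coding can be verified with small matrices. The sketch below uses Python with NumPy and is an illustration only: the function names are assumptions, and the controlled encoding is simulated by linearity (summing over Alice's basis states) rather than by building the controlled unitary explicitly.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0, 1], [1, 0]])
Z = np.diag([1, -1])
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]])
PHI2 = np.array([1, 0, 0, 1]) / np.sqrt(2)    # (|00> + |11>)/sqrt(2)

def dense_coding(a1, a2):
    """State Bob holds after decoding: (H (x) I) CNOT (Z^a1 X^a2 (x) I) |Phi_2>."""
    encode = np.linalg.matrix_power(Z, a1) @ np.linalg.matrix_power(X, a2)
    decode = np.kron(H, I2) @ CNOT
    return decode @ np.kron(encode, I2) @ PHI2

def coherent_dense_coding(amplitudes):
    """Encoding controlled on Alice's register with amplitudes[a1, a2].

    Returns the final state as a tensor indexed [a1, a2, b1, b2]
    (Alice's register first, then Bob's two decoded qubits).
    """
    out = np.zeros((2, 2, 2, 2), dtype=complex)
    for a1 in range(2):
        for a2 in range(2):
            out[a1, a2] += amplitudes[a1, a2] * dense_coding(a1, a2).reshape(2, 2)
    return out
```

With uniform amplitudes on Alice's register the output is the state $\frac{1}{2}\sum_{a_1a_2} |a_1a_2\rangle^A|a_1a_2\rangle^B$, i.e. two copies of $|\Phi_2\rangle$, which is the content of Eq. (3.2) for this input.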

In fact, any unitary operation capable of classical communication is also capable of an equal
amount of coherent classical communication, though in general this only holds asymptotically. The
following theorem gives a general prescription for obtaining coherent communication and proves part
of the equivalence of the two definitions of cobits given in the introduction.

Theorem 3.1. For any bipartite unitary or isometry $U$, if

\[ \langle U \rangle \ge C_1[c \to c] + C_2[c \leftarrow c] + E[qq] \tag{3.3} \]

for $C_1, C_2 \ge 0$ and $E \in \mathbb{R}$ then

\[ \langle U \rangle \ge C_1[[c \to c]] + C_2[[c \leftarrow c]] + E[qq]. \tag{3.4} \]

If we define $\mathrm{CoCoE}(U) = \{(C_1, C_2, E) : \langle U \rangle \ge C_1[[c \to c]] + C_2[[c \leftarrow c]] + E[qq]\}$, then this theorem
states that $\mathrm{CCE}(U)$ and $\mathrm{CoCoE}(U)$ coincide on the quadrant $C_1, C_2 \ge 0$.

Here we will prove only the case where $C_2 = 0$, deferring the full bidirectional proof to Section 3.5.
By appropriate coding (as in [BS03a]), we can reduce the one-way case of Theorem 3.1 to the following
coherent analogue of HSW coding.

Lemma 3.2 (Coherent HSW). Given a PP ensemble of bipartite pure states

\[ |\mathcal{E}\rangle = \sum_{x \in \mathcal{X}} \sqrt{p_x}\, |x\rangle^R |x\rangle^{X_A} |\psi_x\rangle^{AB} \tag{3.5} \]

and an isometry

\[ U_{\mathcal{E}} = \sum_x |x\rangle\langle x|^{X_A} \otimes |\psi_x\rangle^{AB} \tag{3.6} \]

then

\[ \langle U_{\mathcal{E}} : \mathcal{E}^{X_A} \rangle \ge I(X_A; B)_{\mathcal{E}}\, [[c \to c]] + H(B|X_A)_{\mathcal{E}}\, [qq]. \tag{3.7} \]

Proof. A slightly modified form of HSW coding (e.g. [Dev05a]) holds that for any $\delta > 0$, $\epsilon > 0$ and
every $n$ sufficiently large there exists a code $\mathcal{C} \subset \mathcal{X}^n$ with $|\mathcal{C}| = \exp(n(I(X_A; B)_{\mathcal{E}} - \delta))$, a decoding
POVM $\{D_{c^n}\}_{c^n \in \mathcal{C}}$ with error $< \epsilon$ and a type $q$ with $\|p - q\|_1 \le |\mathcal{X}|/n$ such that every codeword
$c^n := c_1 \ldots c_n \in \mathcal{C}$ (corresponding to the state $|\psi_{c^n}\rangle^{AB} := |\psi_{c_1}\rangle^{A_1B_1} \cdots |\psi_{c_n}\rangle^{A_nB_n}$) has type $q$ (i.e.
$\forall x$, $|\{j : c_j = x\}| = nq_x$). By error $< \epsilon$, we mean that for any $c^n \in \mathcal{C}$, $\langle\psi_{c^n}|(I \otimes D_{c^n})|\psi_{c^n}\rangle > 1 - \epsilon$.

Using Neumark's Theorem [Per93], Bob can make his decoding POVM into a unitary operation
$U_D$ defined by $U_D|0\rangle|\phi\rangle = \sum_{c^n} |c^n\rangle \sqrt{D_{c^n}}\, |\phi\rangle$. Applying this to his half of a codeword $|\psi_{c^n}\rangle$ will
yield a state within $\epsilon$ of $|c^n\rangle|\psi_{c^n}\rangle$, since measurements with nearly certain outcomes cause almost no
disturbance [Win99a].

The communication strategy begins by applying $U_{\mathcal{E}}$ to $|c^n\rangle^{X_A}$ to obtain $|c^n\rangle^{X_A} |\psi_{c^n}\rangle^{AB}$. Bob
then decodes unitarily with $U_D$ to yield a state within $\epsilon$ of $|c^n\rangle^{X_A} |c^n\rangle^{X_B} |\psi_{c^n}\rangle^{AB}$. Since $c^n$ is of
type $q$, Alice and Bob can coherently permute the states of $|\psi_{c^n}\rangle$ to obtain a state within $\epsilon$ of
$|c^n\rangle^{X_A} |c^n\rangle^{X_B} |\psi_1\rangle^{\otimes nq_1} \cdots |\psi_{|\mathcal{X}|}\rangle^{\otimes nq_{|\mathcal{X}|}}$. Then they can apply entanglement concentration [BBPS96]
to $|\psi_1\rangle^{\otimes nq_1} \cdots |\psi_{|\mathcal{X}|}\rangle^{\otimes nq_{|\mathcal{X}|}}$ to obtain $\approx nH(B|X_A)_{\mathcal{E}}$ ebits without disturbing the coherent message
$|c^n\rangle^{X_A} |c^n\rangle^{X_B}$. □



This will be partially superseded by the full proof of Theorem 3.1. However, it is worth appreciat- 
ing the key ideas of the proof — making measurements coherent via Neumark's Theorem and finding 
a way to decouple ancillas shared by Alice and Bob — as they will appear again in the later proofs, 
but surrounded by more mathematical details. 

There are many cases in which no ancillas are produced, so we do not need the assumptions
of Lemma 3.2 that communication occurs in large blocks. For example, a CNOT can transmit one
coherent bit from Alice to Bob or one coherent bit from Bob to Alice. Recall the protocol given in
Eq. (2.28) for $\langle\mathrm{CNOT}\rangle + [qq] \ge [c \to c] + [c \leftarrow c]$: $(Z^a H \otimes I)\,\mathrm{CNOT}\,(X^a \otimes Z^b)|\Phi_2\rangle^{AB} = |b\rangle^A |a\rangle^B$. This
can be made coherent by conditioning the encoding on a quantum register $|a\rangle^A |b\rangle^B$, so that

\[ \langle\mathrm{CNOT}\rangle + [qq] \ge [[c \to c]] + [[c \leftarrow c]]. \tag{3.8} \]
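
The identity quoted from Eq. (2.28) can be confirmed directly for all four values of $(a, b)$. This is a minimal NumPy sketch; the function name `two_way_cnot` is an illustrative assumption, and the CNOT is taken with Alice's qubit as control and Bob's as target.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0, 1], [1, 0]])
Z = np.diag([1, -1])
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]])
PHI2 = np.array([1, 0, 0, 1]) / np.sqrt(2)    # (|00> + |11>)/sqrt(2), ordered Alice then Bob

def two_way_cnot(a, b):
    """(Z^a H (x) I) CNOT (X^a (x) Z^b) |Phi_2>: Alice sends a, Bob sends b."""
    encode = np.kron(np.linalg.matrix_power(X, a), np.linalg.matrix_power(Z, b))
    decode = np.kron(np.linalg.matrix_power(Z, a) @ H, I2)
    return decode @ CNOT @ encode @ PHI2
```

The output is the basis state $|b\rangle^A|a\rangle^B$ with no residual phase, so conditioning the encoding on quantum registers leaves no ancilla to be decoupled.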



3.3 Rules for using coherent classical communication 

By discarding her state after sending it, Alice can convert coherent communication into classical
communication, so $[[c \to c]] \ge [c \to c]$. Alice can also generate entanglement by inputting a superposition
of messages (as in Proposition 2.10), so $[[c \to c]] \ge [qq]$. The true power of coherent communication
comes from performing both tasks (classical communication and entanglement generation)
simultaneously. This is possible whenever the classical message sent is coherently decoupled, i.e.
random and nearly independent of the other states at the end of the protocol.

Teleportation [BBC+93] satisfies these conditions, and indeed a coherent version has already
been proposed in [BBC98]. Given an unknown quantum state $|\psi\rangle^A$ and an EPR pair $|\Phi_2\rangle^{AB}$, Alice
begins coherent teleportation not by a Bell measurement on her two qubits but by unitarily rotating
the Bell basis into the computational basis via a CNOT and Hadamard gate. This yields the
state $\frac{1}{2} \sum_{ij} |ij\rangle^A X^i Z^j |\psi\rangle^B$. Using two coherent bits, Alice can send Bob a copy of her register to
obtain $\frac{1}{2} \sum_{ij} |ij\rangle^A |ij\rangle^B X^i Z^j |\psi\rangle^B$. Bob's decoding step can now be made unitary, leaving the state
$(|\Phi_2\rangle^{AB})^{\otimes 2} |\psi\rangle^B$. In terms of resources, this can be summarized as: $2[[c \to c]] + [qq] \ge [q \to q] + 2[qq]$.
Canceling the ebits on both sides (possible since $[[c \to c]] \ge [qq]$) gives $2[[c \to c]] \ge [q \to q] + [qq]$.
Combining this relation with Eq. (3.2) yields the equality*

\[ 2[[c \to c]] = [q \to q] + [qq]. \tag{3.9} \]
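
Coherent teleportation as described above can be simulated on five qubits. The following NumPy sketch is illustrative and makes several assumptions that are not in the text: each cobit is modelled as a CNOT from Alice's qubit onto a fresh $|0\rangle$ held by Bob, the qubit ordering is (input, Alice's EPR half, Bob's EPR half, Bob's two receiving qubits), and the labelling of the Pauli correction follows this particular circuit, so it may differ from the $X^iZ^j$ convention of the text by a relabelling of $i$ and $j$.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)
CZ = np.diag([1, 1, 1, -1]).astype(complex)

def apply(state, gate, qubits):
    """Apply a k-qubit gate (2^k x 2^k matrix) to the listed axes of a state tensor."""
    k = len(qubits)
    g = gate.reshape((2,) * (2 * k))
    # Contract the gate's input indices with the chosen qubit axes
    out = np.tensordot(g, state, axes=(list(range(k, 2 * k)), list(qubits)))
    # tensordot puts the gate's output axes first; move them back into place
    return np.moveaxis(out, list(range(k)), list(qubits))

def coherent_teleport(psi):
    """Qubits: 0 = input, 1 = Alice's EPR half, 2 = Bob's EPR half, 3 and 4 = Bob's fresh registers."""
    phi = np.array([1, 0, 0, 1]) / np.sqrt(2)
    zero = np.array([1, 0])
    state = np.kron(np.kron(psi, phi), np.kron(zero, zero)).reshape((2,) * 5)
    state = apply(state, CNOT, [0, 1])    # rotate the Bell basis into
    state = apply(state, H, [0])          # the computational basis
    state = apply(state, CNOT, [0, 3])    # first cobit: copy qubit 0 to Bob
    state = apply(state, CNOT, [1, 4])    # second cobit: copy qubit 1 to Bob
    state = apply(state, CNOT, [4, 2])    # Bob's controlled X correction
    state = apply(state, CZ, [3, 2])      # Bob's controlled Z correction
    return state
```

The final tensor factorizes as $|\Phi_2\rangle$ on qubits (0, 3), $|\Phi_2\rangle$ on qubits (1, 4) and $|\psi\rangle$ on qubit 2, matching the resource count $2[[c \to c]] + [qq] \ge [q \to q] + 2[qq]$.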

This has two important implications. First, teleportation and super-dense coding are reversible so
long as all of the classical communication is left coherent. Second, cobits are equivalent, as resources,
to the existing resources of qubits and ebits. This means that we don't need to calculate quantities
such as the cobit capacity of a quantum channel; coherent communication introduces a new tool for
solving old problems in quantum Shannon theory, and is not directly a source of new problems.

Another protocol that can be made coherent is Gottesman's method [Got99] for simulating a
distributed CNOT using one ebit and one cbit in either direction. At first glance, this appears
completely irreversible, since a CNOT can be used to send one cbit forward or backwards, or to
create one ebit, but no more than one of these at a time.



*Our use of the Cancellation Lemma means that this equality is only asymptotically valid. [vE05] proves a single-
shot version of this equality, but it requires that the two cobits be applied in series, with local unitary operations in
between.

Using coherent bits as inputs, though, allows the recovery of 2 ebits at the end of the protocol,
so $[[c \to c]] + [[c \leftarrow c]] + [qq] \ge \langle\mathrm{CNOT}\rangle + 2[qq]$, or using entanglement catalytically, $[[c \to c]] + [[c \leftarrow
c]] \ge \langle\mathrm{CNOT}\rangle + [qq]$. Combined with Eq. (2.28), this yields another equality:

\[ \langle\mathrm{CNOT}\rangle + [qq] = [[c \to c]] + [[c \leftarrow c]]. \]

Another useful bipartite unitary gate is SWAP, which we recall is equivalent to $[q \to q] + [q \leftarrow q]$.
Applying Eq. (3.9) then yields

\[ 2\langle\mathrm{CNOT}\rangle = 1\langle\mathrm{SWAP}\rangle \]

which explains the similar communication and entanglement capacities for these gates found in the
last chapter. Previously, the most efficient methods known to transform between these gates gave
$3\langle\mathrm{CNOT}\rangle \ge 1\langle\mathrm{SWAP}\rangle \ge 1\langle\mathrm{CNOT}\rangle$.

A similar argument can be applied to DCNOT. Since $\langle\mathrm{DCNOT}\rangle + 2[qq] \ge 2[c \to c] + 2[c \leftarrow c]$, it
follows (from Theorem 3.1 or direct examination) that $\langle\mathrm{DCNOT}\rangle + 2[qq] \ge 2[[c \to c]] + 2[[c \leftarrow c]]$ and
that $\langle\mathrm{DCNOT}\rangle \ge [q \to q] + [q \leftarrow q] = \langle\mathrm{SWAP}\rangle$. Combining this with Proposition 2.8, we find that
$\langle\mathrm{DCNOT}\rangle = \langle\mathrm{SWAP}\rangle$, a surprising fact in light of the observation of [HVC02] that DCNOT is easier for
some nonlocal Hamiltonians to simulate than SWAP. In fact, by the same argument, any gate in $\mathcal{U}_{d \times d}$
with $C_+^E(U) = 4\log d$ must be equivalent to the $d \times d$ SWAP gate.

The above examples give the flavor of when classical communication can be replaced by coherent
communication (i.e. "made coherent"). In general, we require that the classical message be (almost)
uniformly random and (almost) coherently decoupled from all other systems, including the environment.
This leads us to two general rules regarding making classical communication coherent. When
coherently decoupled cbits are in the input to a protocol, Rule I ("input") says that replacing them
with cobits not only performs the protocol, but also has the side effect of generating entanglement.
Rule O ("output") is simpler; it says that if a protocol outputs coherently decoupled cbits, then it
can be modified to instead output cobits. Once coherently decoupled cbits are replaced with cobits
we can then use Eq. (3.9) to in turn replace cobits with qubits and ebits. Thus, while cobits are
conceptually useful, we generally start and finish with protocols involving the standard resources of
cbits, ebits and qubits.

Below we give formal statements of Rules I and O, deferring their proofs till the end of the chapter.

Theorem 3.3 (Rule I). If, for some quantum resources $\alpha, \beta \in \mathcal{R}$,

\[ \alpha + R[c \to c : \tau] \ge \beta \]

and the classical resource $R[c \to c : \tau]$ is coherently decoupled then

\[ \alpha + \frac{R}{2}[q \to q] \ge \beta + \frac{R}{2}[qq]. \]

Remark: This can be thought of as a coherent version of Lemma 1.38.

The idea behind the proof is that replacing $R[c \to c : \tau]$ with $R[[c \to c : \tau]]$ then gives an extra
output of $R[qq]$, implying that $\alpha + R[[c \to c : \tau]] \ge \beta + R[qq]$. Then $[[c \to c : \tau]]$ can be replaced
by $\frac{1}{2}([q \to q] + [qq])$ using Eq. (3.9) and Lemma 1.36. To prove this rigorously will require carefully
accounting for the errors, which we will do in Section 3.5.

Theorem 3.4 (Rule O). If, for some quantum resources $\alpha, \beta \in \mathcal{R}$,

\[ \alpha \ge \beta + R[c \to c] \]

and the classical resource is decoupled with respect to the RI then

\[ \alpha \ge \beta + \frac{R}{2}[qq] + \frac{R}{2}[q \to q]. \]

Here the proof is even simpler: $R[c \to c]$ in the output is replaced with $R[[c \to c]]$, which is
equivalent to $\frac{R}{2}([q \to q] + [qq])$. Again, the details are given in Section 3.5.

In the next chapter, we will show how Rules I and O can be used to obtain a family of optimal
protocols (and trade-off curves) for generating cbits, ebits and qubits from noisy channels and states.
First, we show a simpler example of how a protocol can be made coherent in the next section.

3.4 Applications to remote state preparation and unitary 
gate capacities 

3.4.1 Remote state preparation 

Remote state preparation (RSP) is the task of simulating a $\{c \to q\}$ channel, usually using cbits
and ebits. In this section, we show how RSP can be made coherent, not only by applying Rule I to
the input cbits, but also by replacing the $\{c \to q\}$ channel by a coherent version that will preserve
superpositions of inputs. Finally, we will use this coherent version of RSP to derive the capacity of a
unitary gate to send a classical message from Alice to Bob while using/creating an arbitrary amount
of entanglement.

Begin by recalling from Section 1.4 our definition of RSP. Let $\mathcal{E} = \sum_i p_i |i\rangle\langle i|^{X_A} \otimes |\psi_i\rangle\langle\psi_i|^{AB}$ be
an ensemble of bipartite states and $\mathcal{M}_{\mathcal{E}} : |i\rangle\langle i|^{X_A} \to |i\rangle\langle i|^{X_A} \otimes |\psi_i\rangle\langle\psi_i|^{AB}$ the $\{c \to q\}$ channel such
that $\mathcal{M}_{\mathcal{E}}(\mathcal{E}^{X_A}) = \mathcal{E}$. The main coding theorem of RSP [BHL+05] states that

\[ I(X_A; B)_{\mathcal{E}}\, [c \to c] + H(B)_{\mathcal{E}}\, [qq] \ge \langle \mathcal{M}_{\mathcal{E}} : \mathcal{E}^{X_A} \rangle. \tag{3.10} \]

We will show that the input cbits in Eq. (3.10) are coherently decoupled, so that according to
Rule I, replacing them with cobits will perform the protocol and return some entanglement at the
same time. This reduces the entanglement cost to $H(B) - I(X_A; B) = H(B|X_A)$, so that

\[ I(X_A; B)_{\mathcal{E}}\, [[c \to c]] + H(B|X_A)_{\mathcal{E}}\, [qq] \ge \langle \mathcal{M}_{\mathcal{E}} : \mathcal{E}^{X_A} \rangle. \tag{3.11} \]

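In fact, we can prove an even stronger statement, in which not only is the input coherently decoupled,
but there is a sense in which the output is as well. Define a coherent analogue of $\mathcal{M}_{\mathcal{E}}$, which we call
$U_{\mathcal{E}}$, by

\[ U_{\mathcal{E}} = \sum_i |i\rangle\langle i|^{X_A} \otimes |\psi_i\rangle^{AB}. \tag{3.12} \]

We also replace the QP ensemble $\mathcal{E}$ with the (PP formalism) pure state $|\mathcal{E}\rangle$ given by

\[ |\mathcal{E}\rangle = \sum_i \sqrt{p_i}\, |i\rangle^R |i\rangle^{X_A} |\psi_i\rangle^{AB}. \tag{3.13} \]

We will prove that

\[ I(X_A; B)_{\mathcal{E}}\, [[c \to c]] + H(B|X_A)_{\mathcal{E}}\, [qq] \ge \langle U_{\mathcal{E}} : \mathcal{E}^{X_A} \rangle. \tag{3.14} \]

Since $\langle U_{\mathcal{E}} : \mathcal{E}^{X_A} \rangle \ge \langle \mathcal{M}_{\mathcal{E}} : \mathcal{E}^{X_A} \rangle$, this RI implies Eq. (3.11); in particular, the presence of the reference
system $R$ ensures that $\mathcal{E}^{X_A}$ is the same in both cases, even if the $|\psi_i\rangle$ are not all orthogonal. Proving
Eq. (3.14) will require careful examination of the protocol from [BHL+05], so we defer the details
until Section 3.5.

Remark: An interesting special case is when $H(A)_{\mathcal{E}} = 0$, so that Alice is preparing pure states in
Bob's lab rather than entangled states. In this case, $H(B|X_A)_{\mathcal{E}} = 0$ and Eq. (3.11) becomes simply

\[ H(B)_{\mathcal{E}}\, [[c \to c]] \ge \langle U_{\mathcal{E}} : \mathcal{E}^{X_A} \rangle. \tag{3.15} \]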

Thus, if we say (following [BHL+05]) that Eq. (3.10) means that "1 cbit + 1 ebit $\ge$ 1 remote
qubit," then Eq. (3.11) means that "1 cobit $\ge$ 1 remote qubit." Here "$n$ remote qubits" mean the
ability of Alice to prepare an $n$-qubit state of her choice in Bob's lab, though we cannot readily
define an asymptotic resource corresponding to this ability, since it would violate the quasi-i.i.d.
condition (Eq. (1.11)). Despite not being formally defined as a resource, we can think of remote
qubits as intermediate in strength between qubits and cbits, just as cobits are; i.e. 1 qubit $\ge$ 1
remote qubit $\ge$ 1 cbit. As resources intermediate between qubits and cbits, remote qubits and cobits
have complementary attributes: remote qubits share with qubits the ability to transmit arbitrary
pure states, though they cannot create entanglement, while cobits can generate entanglement, but
at first glance appear to only be able to faithfully transmit the computational basis states to Bob.
Thus it is interesting that in fact 1 cobit $\ge$ 1 remote qubit, and that (due to [BHL+05]) this map is
optimal.

Eq. (3.11) yields two other useful corollaries, which we state in the informal language of remote 
qubits. 

Corollary 3.5 (RSP capacity of unitary gates). If $U$ is a unitary gate or isometry with $\langle U \rangle \ge
C[c \to c]$ then $\langle U \rangle \ge C$ remote qubits($\to$).

Corollary 3.6 (Super-dense coding of quantum states). $[q \to q] + [qq] \ge 2$ remote qubits($\to$).

More formally, we could say that if $H(B)_{\mathcal{E}} \le C$ for an ensemble $\mathcal{E}$, then $\langle U \rangle \ge \langle U_{\mathcal{E}} : \mathcal{E}^{X_A} \rangle$, and
similarly for Corollary 3.6. We can also express Corollary 3.6 entirely in terms of standard resources
as

\[ \frac{1}{2} I(X_A; B)_{\mathcal{E}}\, [q \to q] + \left( H(B)_{\mathcal{E}} - \frac{1}{2} I(X_A; B)_{\mathcal{E}} \right) [qq] \ge \langle U_{\mathcal{E}} : \mathcal{E}^{X_A} \rangle. \tag{3.16} \]

Though this last expression is not particularly attractive, it turns out to be optimal, and in fact
to give rise to optimal trade-offs for performing RSP with the three resources of cbits, ebits and
qubits [AH03] (see also [AHSW04] for a single-shot version of the coding theorem). We will find this
pattern repeated many times in the next chapter; by making existing protocols coherent and using
basic information-theoretic inequalities, we obtain a series of optimal tradeoff curves.

Corollary 3.6 was first proven directly in [HHL04] (see also [AHSW04]) and in fact, finding an
alternate proof was the original motivation for the idea of coherent classical communication.

Coherent RSP: Now, we explore the consequences of the stronger version of coherent RSP in
Eq. (3.14). Just as RSP and HSW coding reverse one another given free entanglement, coherent RSP
(Eq. (3.14)) and coherent HSW coding (Lemma 3.2) reverse each other, even taking entanglement
into account. Combining them gives the powerful equality

\[ I(X_A; B)_{\mathcal{E}}\, [[c \to c]] + H(B|X_A)_{\mathcal{E}}\, [qq] = \langle U_{\mathcal{E}} : \mathcal{E}^{X_A} \rangle, \tag{3.17} \]

which improves the original RSP-HSW duality in Eq. (2.19) by eliminating the need for free entanglement.
This remarkable statement simultaneously implies entanglement concentration, entanglement
dilution, the HSW theorem and remote state preparation and super-dense coding of entangled states.*

3.4.2 One-way classical capacities of unitary gates 

Here we will use Eq. (3.17) to determine the capacity of a unitary gate V to simultaneously send a
classical message and generate or consume entanglement at any finite rate. The proof idea is similar
to one in Theorem 2.9; we will use the equivalence between (coherent) ensembles and standard
resources (cobits and ebits) to turn a one-shot improvement in mutual information and expected
entanglement into an asymptotically efficient protocol. Now that we have an improved version of the
duality between RSP and HSW coding, we obtain a precise accounting of the amount of entanglement
generated/consumed.

*On the other hand, we had to use almost all of these statements in order to prove the result! Still it is nice to see
them all unified in a single powerful equation. Also, recent work by Devetak [Dev05b] further generalizes the equalities
that can be stated about isometries from A to AB.



Theorem 3.7. Define $CE(V) := \{(C,E) : (C,0,E) \in CCE(V)\}$ and

$$\Delta_{I,E}(V) := \left\{(C,E) : \exists\,\mathcal{E} \text{ s.t. } I(X_A;B)_{V(\mathcal{E})} - I(X_A;B)_{\mathcal{E}} \geq C \text{ and } H(B|X_A)_{V(\mathcal{E})} - H(B|X_A)_{\mathcal{E}} \geq E\right\}, \tag{3.18}$$

where $\mathcal{E}$ is an ensemble of bipartite pure states in $AB$ conditioned on a classical register $X_A$.
Then $CE(V)$ is equal to the closure of $\Delta_{I,E}(V)$.

Thus the asymptotic capacity using $-E$ ebits of assistance per use of V (or simultaneously outputting
$E$ ebits) equals the largest increase in mutual information possible with one use of V if the
average entanglement decreases by no more than $-E$. Theorem 2.9 proved this for $E = -\infty$ and
our proof here is quite similar. Note that the statement of the theorem is the same whether we
consider QP ensembles $\mathcal{E}$ or PP ensembles $|\mathcal{E}\rangle$, though the proof will use the coherent version of RSP
in Eq. (3.14).
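
To make the two quantities in Eq. (3.18) concrete, here is a minimal numerical sketch that evaluates $I(X_A;B)$ and $H(B|X_A)$ for a toy ensemble. It assumes Bob holds a single qubit and that each ensemble member is given directly by Bob's real 2x2 reduced density matrix; the function names and the parametrization in `bob_reduced_state` are illustrative choices, not notation from the text.

```python
import math

def binary_entropy(p):
    """Binary entropy H2(p) in bits, with H2(0) = H2(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def qubit_entropy(rho):
    """Von Neumann entropy (bits) of a real symmetric 2x2 density matrix."""
    (a, b), (_, d) = rho
    det = a * d - b * b
    # Eigenvalues of a trace-one 2x2 matrix are (1 +/- sqrt(1 - 4 det)) / 2.
    disc = max(0.0, 1.0 - 4.0 * det)
    return binary_entropy(0.5 * (1.0 + math.sqrt(disc)))

def bob_reduced_state(theta, phi):
    """Bob's reduced state of cos(theta)|0>|u> + sin(theta)|1>|u_perp>,
    with |u> = cos(phi)|0> + sin(phi)|1> (hypothetical parametrization)."""
    c, s = math.cos(phi), math.sin(phi)
    w0, w1 = math.cos(theta) ** 2, math.sin(theta) ** 2
    off_diag = (w0 - w1) * c * s
    return ((w0 * c * c + w1 * s * s, off_diag),
            (off_diag, w0 * s * s + w1 * c * c))

def ensemble_quantities(ensemble):
    """Return (I(X_A;B), H(B|X_A)) in bits for a list of (p_i, rho_i^B)."""
    avg = [[0.0, 0.0], [0.0, 0.0]]
    conditional = 0.0
    for p, rho in ensemble:
        conditional += p * qubit_entropy(rho)   # H(B|X_A) = sum_i p_i S(rho_i)
        for r in range(2):
            for col in range(2):
                avg[r][col] += p * rho[r][col]
    h_b = qubit_entropy(avg)                    # H(B) of the average state
    return h_b - conditional, conditional       # I = H(B) - H(B|X_A)
```

Under these assumptions, an ensemble of two orthogonal product states carries one bit of mutual information and no entanglement, while a single maximally entangled state carries one ebit and no mutual information. A point $(C,E)$ of $\Delta_{I,E}$ would be obtained by evaluating `ensemble_quantities` before and after applying the gate and taking differences.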



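Proof. Coding theorem: Suppose there exists an ensemble $\mathcal{E}$ with $C = I(X_A;B)_{V(\mathcal{E})} - I(X_A;B)_{\mathcal{E}}$ and
$E = H(B|X_A)_{V(\mathcal{E})} - H(B|X_A)_{\mathcal{E}}$. Then

$$\begin{aligned}
\langle V \rangle + \langle U_{\mathcal{E}} \rangle &\geq \langle U_{V(\mathcal{E})} \rangle \\
&\geq I(X_A;B)_{V(\mathcal{E})}\,[[c \to c]] + H(B|X_A)_{V(\mathcal{E})}\,[qq] \\
&\geq \left(I(X_A;B)_{V(\mathcal{E})} - I(X_A;B)_{\mathcal{E}}\right)[[c \to c]] + \left(H(B|X_A)_{V(\mathcal{E})} - H(B|X_A)_{\mathcal{E}}\right)[qq] + \langle U_{\mathcal{E}} \rangle .
\end{aligned}$$

Here the second RI used coherent HSW coding (Lemma 3.2) and the third RI used coherent RSP
(Eq. (3.14)). We now use the Cancellation Lemma to show that $\langle V \rangle \geq C[c \to c] + E[qq]$, implying
that $(C,0,E) \in CCE(V)$.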

Converse: We will actually prove a stronger result, in which Bob is allowed unlimited classical
communication to Alice. Thus, we will show that if $\langle V \rangle + \infty[c \leftarrow c] \geq C[c \to c] + E[qq]$, then there
is a sequence of ensembles $\{\tilde{\mathcal{E}}_n\}$ with $\left(I(X_A;B)_{V(\tilde{\mathcal{E}}_n)} - I(X_A;B)_{\tilde{\mathcal{E}}_n},\; H(B|X_A)_{V(\tilde{\mathcal{E}}_n)} - H(B|X_A)_{\tilde{\mathcal{E}}_n}\right)$
converging to $(C,E)$ as $n \to \infty$. This will imply that $(C,E)$ is in the closure of $\Delta_{I,E}(V)$.

Let $Y := Y_A Y_B$ be the cumulative record of all of Bob's classical messages to Alice. Using the QP
formalism, we assume without loss of generality that Bob always transmits his full measurement
outcome (cf. Section III of [HL04]) so that Alice and Bob always hold a pure state conditioned on
$X_A Y$; i.e. $H(AB|X_AY) = 0$ and $H(A|X_AY) = H(B|X_AY) = I(A\rangle BX_AY) = I(B\rangle AX_AY)$.

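First consider the case when $E > 0$. Fix a protocol which uses V $n$ times to communicate
$\geq n(C - \delta')$ bits and create $\geq n(E - \delta')$ ebits with error $\leq \epsilon$. They start with a product state $\mathcal{E}_0$ for
which $I(X_A;B)_{\mathcal{E}_0} = 0$ and $H(B|X_A)_{\mathcal{E}_0} = 0$. Denote their state immediately after $j$ uses of V by $\mathcal{E}_j$.
(Without loss of generality, we assume that the $n$ uses of V are applied serially.) Then by Lemma 1.2,
$I(X_A;B)_{\mathcal{E}_n} \geq n(C - \delta)$ and $H(B|X_AY)_{\mathcal{E}_n} \geq n(E - \delta)$ where $\delta = O(\delta' + \epsilon) \to 0$ as $n \to \infty$.

Now define the ensemble $\tilde{\mathcal{E}}_n = \frac{1}{n}\sum_{j=1}^{n} |jj\rangle\langle jj|^{Z_AZ_B} \otimes V^\dagger(\mathcal{E}_j)^{ABX_AY_AY_B}$. We think of $X :=
X_AY_AZ_A$ as the message variable and $\tilde{B} := BY_BZ_B$ as representing Bob's system. We will prove
that $I(X;\tilde{B})_{V(\tilde{\mathcal{E}}_n)} - I(X;\tilde{B})_{\tilde{\mathcal{E}}_n} \geq C - \delta$ and that $H(\tilde{B}|X)_{V(\tilde{\mathcal{E}}_n)} - H(\tilde{B}|X)_{\tilde{\mathcal{E}}_n} \geq E - \delta$.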

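First consider the change in mutual information. Since $Y_A = Y_B$ and $Z_A = Z_B$ (as random
variables), $I(X;\tilde{B})_{\tilde{\mathcal{E}}_n} = I(X_AY_AZ_A; BY_BZ_B)_{\tilde{\mathcal{E}}_n} = I(X_A;B|YZ)_{\tilde{\mathcal{E}}_n} + H(YZ)_{\tilde{\mathcal{E}}_n}$ and similarly when
we replace $\tilde{\mathcal{E}}_n$ with $V(\tilde{\mathcal{E}}_n)$. Since V doesn't act on $Y$ or $Z$, we have $H(YZ)_{\tilde{\mathcal{E}}_n} = H(YZ)_{V(\tilde{\mathcal{E}}_n)}$ and
thus

$$\begin{aligned}
I(X;\tilde{B})_{V(\tilde{\mathcal{E}}_n)} - I(X;\tilde{B})_{\tilde{\mathcal{E}}_n}
&= I(X_A;B|YZ)_{V(\tilde{\mathcal{E}}_n)} - I(X_A;B|YZ)_{\tilde{\mathcal{E}}_n} \\
&= \frac{1}{n}\sum_{j=1}^{n} \left( I(X_A;B|Y)_{\mathcal{E}_j} - I(X_A;B|Y)_{V^\dagger(\mathcal{E}_j)} \right) \\
&= \frac{1}{n}\left( I(X_A;B|Y)_{\mathcal{E}_n} - I(X_A;B|Y)_{\mathcal{E}_0} \right) + \frac{1}{n}\sum_{j=1}^{n} \left( I(X_A;B|Y)_{\mathcal{E}_{j-1}} - I(X_A;B|Y)_{V^\dagger(\mathcal{E}_j)} \right) \\
&\geq C - \delta + \frac{1}{n}\sum_{j=1}^{n} \left( I(X_A;B|Y)_{\mathcal{E}_{j-1}} - I(X_A;B|Y)_{V^\dagger(\mathcal{E}_j)} \right).
\end{aligned} \tag{3.19}$$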

Recall that going from $\mathcal{E}_{j-1}$ to $V^\dagger(\mathcal{E}_j)$ involves local unitaries, a measurement by Bob and classical
communication of the outcome from Bob to Alice. We claim that $I(X_A;B|Y)$ does not increase
under this process, meaning that the expression inside the sum on the last line is always nonnegative
and that $I(X;\tilde{B})_{V(\tilde{\mathcal{E}}_n)} - I(X;\tilde{B})_{\tilde{\mathcal{E}}_n} \geq C - \delta$, implying our desired conclusion. To prove this, write
$I(X_A;B|Y)$ as $I(X_A;BY) - I(X_A;Y)$. The $I(X_A;BY)$ term is nonincreasing due to the data-processing
inequality [SN96], while $I(X_A;Y)$ can only increase since each round of communication
only causes $Y$ to grow.

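Now we examine the change in entanglement.

$$\begin{aligned}
H(\tilde{B}|X)_{V(\tilde{\mathcal{E}}_n)} - H(\tilde{B}|X)_{\tilde{\mathcal{E}}_n}
&= H(BY_BZ_B|X_AY_AZ_A)_{V(\tilde{\mathcal{E}}_n)} - H(BY_BZ_B|X_AY_AZ_A)_{\tilde{\mathcal{E}}_n} \\
&= H(B|X_AYZ)_{V(\tilde{\mathcal{E}}_n)} - H(B|X_AYZ)_{\tilde{\mathcal{E}}_n} \\
&= \frac{1}{n}\left( H(B|X_AY)_{\mathcal{E}_n} - H(B|X_AY)_{\mathcal{E}_0} \right) + \frac{1}{n}\sum_{j=1}^{n} \left( H(B|X_AY)_{\mathcal{E}_{j-1}} - H(B|X_AY)_{V^\dagger(\mathcal{E}_j)} \right) \\
&\geq E - \delta + \frac{1}{n}\sum_{j=1}^{n} \left( H(B|X_AY)_{\mathcal{E}_{j-1}} - H(B|X_AY)_{V^\dagger(\mathcal{E}_j)} \right).
\end{aligned} \tag{3.20}$$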

We would like to show that this last term is positive, or equivalently that $H(B|X_AY)$ is at least as large
for $\mathcal{E}_{j-1}$ as it is for $V^\dagger(\mathcal{E}_j)$. This change from $\mathcal{E}_{j-1}$ to $V^\dagger(\mathcal{E}_j)$ involves local unitaries, a measurement
by Bob and another classical message from Bob to Alice, which we call $Y_j$. Also, call the first $j-1$
messages $Y^{j-1}$. Thus, we would like to show that $H(B|X_AY^{j-1})_{\mathcal{E}_{j-1}} - H(B|X_AY^{j-1}Y_j)_{V^\dagger(\mathcal{E}_j)} \geq 0$.
This can be expressed as an average over $H(B)_{\mathcal{E}_{j-1}|x,y^{j-1}} - H(B|Y_j)_{V^\dagger(\mathcal{E}_j)|x,y^{j-1}}$, where $\mathcal{E}_{j-1}|x,y^{j-1}$
indicates that we have conditioned $\mathcal{E}_{j-1}$ on $X_A = x$ and $Y^{j-1} = y^{j-1}$. This last quantity is positive
because of the principle that the average entropy of states output from a projective measurement is no
greater than the entropy of the original state [Nie99a]. Thus $H(\tilde{B}|X)_{V(\tilde{\mathcal{E}}_n)} - H(\tilde{B}|X)_{\tilde{\mathcal{E}}_n} \geq E - \delta$.
As $n \to \infty$, $\delta \to 0$, proving the theorem.

The case when $E < 0$ is similar. We now begin with $H(B|X_AY_A)_{\mathcal{E}_0} \leq n(-E + \delta) = -n(E - \delta)$
and since $X_A$ and $Y$ are classical registers, finish with $H(B|X_AY_A)_{\mathcal{E}_n} \geq 0$. Thus $H(B|X_AY_A)_{\mathcal{E}_n} -
H(B|X_AY_A)_{\mathcal{E}_0} \geq n(E - \delta)$. The rest of the proof is the same as the $E > 0$ case. □

3.4.3 Two-way cbit, cobit, qubit and ebit capacities of unitary gates 

So far we have two powerful results about unitary gate capacity regions: Theorem 3.1 relates CCE and
CoCoE in the $C_1, C_2 \geq 0$ quadrant and Theorem 3.7 gives an expression for CE(U) in terms of a single
use of U. Moreover, the proof of Theorem 3.7 also showed that backwards classical communication
cannot improve the forward capacity of a unitary gate. This allows us to extend Theorem 3.1 to
$C_1 < 0$ or $C_2 < 0$ as follows:

Theorem 3.8. For arbitrary real numbers $C_1, C_2, E$,

$$(C_1,C_2,E) \in CCE \iff (C_1,C_2,E - \min(C_1,0) - \min(C_2,0)) \in CoCoE. \tag{3.21}$$

This theorem is a direct consequence of the following Lemma, which enumerates the relevant
quadrants of the $(C_1,C_2)$ plane.

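Lemma 3.9. For any bipartite unitary or isometry $U$ and $C_1, C_2 \geq 0$,

$$\begin{aligned}
C_2[c \leftarrow c] + \langle U \rangle &\geq C_1[c \to c] + E[qq] \quad \text{iff} && (3.22)\\
\langle U \rangle &\geq C_1[c \to c] + E[qq] \quad \text{iff} && (3.23)\\
\langle U \rangle &\geq C_1[[c \to c]] + E[qq] \quad \text{iff} && (3.24)\\
C_2[[c \leftarrow c]] + \langle U \rangle &\geq C_1[[c \to c]] + (E + C_2)[qq] && (3.25)
\end{aligned}$$

and

$$\begin{aligned}
C_1[c \to c] + C_2[c \leftarrow c] + \langle U \rangle &\geq E[qq] \quad \text{iff} && (3.26)\\
\langle U \rangle &\geq E[qq] \quad \text{iff} && (3.27)\\
C_1[[c \to c]] + C_2[[c \leftarrow c]] + \langle U \rangle &\geq (E + C_1 + C_2)[qq] && (3.28)
\end{aligned}$$

Basically, the rate at which Alice can send Bob cbits while consuming/generating ebits is not
increased by (coherent) classical communication from Bob to Alice, except for a trivial gain of
entanglement when the assisting classical communication is coherent.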

Proof. Combining (TP) and coherent SD (Eq. (3.2)) yields $2[c \to c] + [qq] + [q \to q] + [qq] \geq [q \to
q] + 2[[c \to c]]$. Canceling the $[q \to q]$ from both sides and dividing by two gives us

$$[c \to c] + [qq] \geq [[c \to c]]. \tag{3.29}$$

For the first part of the lemma, recall from the proof of Theorem 3.7 that free backcommunication
does not improve the forward capacity of a gate. This means that Eq. (3.22) $\Rightarrow$ Eq. (3.23). We obtain
Eq. (3.23) $\Rightarrow$ Eq. (3.24) from Theorem 3.1 and Eq. (3.24) $\Rightarrow$ Eq. (3.25) follows from $[[c \leftarrow c]] \geq [qq]$
and composability (Theorem 1.22). Finally, Eq. (3.25) $\Rightarrow$ Eq. (3.22) because of Eq. (3.29).

For the second part of the lemma, Eq. (3.26) $\Rightarrow$ Eq. (3.27) follows from Theorem 2.6, Eq. (3.27)
$\Rightarrow$ Eq. (3.28) is trivial and Eq. (3.28) $\Rightarrow$ Eq. (3.26) is a consequence of Eq. (3.29). □

Quantum capacities of unitary gates: These techniques also allow us to determine the quantum
capacities of unitary gates. Define QQE to be the region $\{(Q_1,Q_2,E) : \langle U \rangle \geq Q_1[q \to q] + Q_2[q \leftarrow
q] + E[qq]\}$, corresponding to two-way quantum communication. We can also consider coherent
classical communication in one direction and quantum communication in the other; let QCoE be the
region $\{(Q_1,C_2,E) : \langle U \rangle \geq Q_1[q \to q] + C_2[[c \leftarrow c]] + E[qq]\}$ and define CoQE similarly.

As a warmup, we can use the equality $2[[c \to c]] = [q \to q] + [qq]$ to relate CoE and QE, defined as
CoE $= \{(C,E) : (C,0,E) \in \text{CoCoE}\}$ and QE $= \{(Q,E) : (Q,0,E) \in \text{QQE}\}$. We claim that

$$(Q,E) \in \text{QE} \iff (2Q, E - Q) \in \text{CoE}. \tag{3.30}$$

To prove Eq. (3.30), choose any $(Q,E) \in$ QE. Then $\langle U \rangle \geq Q[q \to q] + E[qq] = 2Q[[c \to c]] + (E -
Q)[qq]$, so $(2Q,E-Q) \in$ CoE. Conversely, if $(2Q,E-Q) \in$ CoE, then $\langle U \rangle \geq 2Q[[c \to c]] + (E-Q)[qq] =
Q[q \to q] + E[qq]$, so $(Q,E) \in$ QE.
Note that the above argument still works if we add the same resource, such as $Q_2[q \leftarrow q]$, to the
right hand side of each resource inequality. Therefore, the same argument that proved Eq. (3.30) also
establishes the following equivalences for bidirectional rate regions:

$$\begin{array}{ccc}
(Q_1,Q_2,E) \in \text{QQE} & \iff & (2Q_1,Q_2,E-Q_1) \in \text{CoQE} \\
\Updownarrow & & \Updownarrow \\
(Q_1,2Q_2,E-Q_2) \in \text{QCoE} & \iff & (2Q_1,2Q_2,E-Q_1-Q_2) \in \text{CoCoE}
\end{array} \tag{3.31}$$
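
The bookkeeping behind Eqs. (3.30), (3.31) and (3.21) can be sketched as simple maps on rate triples. The sketch below assumes only the trade "one qubit equals two cobits minus one ebit" in each direction; the function names are illustrative, and the functions move points between regions without testing whether a point is actually achievable for a given gate.

```python
from fractions import Fraction

def qqe_to_coqe(q1, q2, e):
    """Forward trade: Q1 qubits = 2*Q1 cobits - Q1 ebits, as in Eq. (3.30)."""
    return (2 * q1, q2, e - q1)

def qqe_to_qcoe(q1, q2, e):
    """The same trade applied to the backward direction."""
    return (q1, 2 * q2, e - q2)

def qqe_to_cocoe(q1, q2, e):
    """Both trades at once; the order in which they are applied is irrelevant."""
    return (2 * q1, 2 * q2, e - q1 - q2)

def cocoe_to_qqe(c1, c2, e):
    """Inverse map: C cobits = C/2 qubits + C/2 ebits in each direction."""
    return (c1 / 2, c2 / 2, e + c1 / 2 + c2 / 2)

def cce_to_cocoe(c1, c2, e):
    """Eq. (3.21): shift the entanglement rate by -min(C1, 0) - min(C2, 0)."""
    return (c1, c2, e - min(c1, 0) - min(c2, 0))
```

For example, with exact rationals, `qqe_to_cocoe(Fraction(1, 2), Fraction(3, 4), Fraction(2))` returns the triple (1, 3/2, 3/4), and `cocoe_to_qqe` maps it back to the starting point. The square in Eq. (3.31) commutes because the two trades act on different coordinates.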

Finally, Eq. (3.21) further relates QQE, QCE, CQE, CCE, where QCE and CQE are defined similarly
to QCoE and CoQE but with incoherent classical communication instead.

Thus once one of the capacity regions (say QCoE) is determined, all other capacity regions discussed
above are determined. The main open problem that remains is to find an efficiently computable
expression for part of this capacity region. Theorem 3.7 gives a formula for the one-way cbit/ebit
tradeoff that involves only a single use of the unitary gate, but we still need upper bounds on the
optimal ensemble size and ancilla dimension for it to be practical.



3.5 Collected proofs 

In this section we give proofs that various protocols can be made coherent. We start with Rules I and
O (from Section 3.3), which gave conditions for when coherently decoupled cbits could be replaced by
cobits in asymptotic protocols. Then we show specifically how remote state preparation can be made
coherent, proving Eq. (3.14). Finally, we show how two-way classical communication from unitary
operations can be made coherent, and prove Theorem 3.1.



3.5.1 Proof of Rule I 

In what follows we shall fix $\epsilon$ and consider a sufficiently large $n$ so that the protocol is $\epsilon$-valid,
$\epsilon^2$-decoupled and accurate to within $\epsilon$.

Whenever the resource inequality features $[c \to c]$ in the input this means that Alice performs a
von Neumann measurement on some subsystem $A_1$ of dimension $D \approx \exp(n(R + \delta))$, the outcome
of which she sends to Bob, who then performs a unitary operation depending on the received
information.

Before Alice's von Neumann measurement, the joint state of $A_1$ and the remaining quantum
system $Q$ is

$$\sum_x \sqrt{p_x}\,|x\rangle^{A_1}|\phi_x\rangle^{Q},$$

where by $\epsilon$-validity

$$\sum_x \left|p_x - D^{-1}\right| \leq \epsilon.$$



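Upon learning the measurement outcome $x$, Bob performs some unitary $U_x$ on his part of $Q$, almost
decoupling it from $x$:

$$\left\| \sum_x p_x |x\rangle\langle x| \otimes \theta'_x - \sum_x p_x |x\rangle\langle x| \otimes \bar{\theta}' \right\|_1 \leq \epsilon^2,$$

where $|\theta'_x\rangle = U_x|\phi_x\rangle$ and $\bar{\theta}' = \sum_x p_x \theta'_x$. To simplify the analysis, extend $Q$ to a larger Hilbert
space on which there exist purifications $|\theta\rangle\langle\theta| \supseteq \bar{\theta}'$ and $|\theta_x\rangle\langle\theta_x| \supseteq |\theta'_x\rangle\langle\theta'_x|$ such that (according to
Lemma 1.3) $\|\theta_x - \theta\|_1 \leq 2\sqrt{\|\theta'_x - \bar{\theta}'\|_1}$. Then

$$\sum_x p_x \|\theta_x - \theta\|_1 \leq 2\sum_x p_x \sqrt{\|\theta'_x - \bar{\theta}'\|_1} \leq 2\sqrt{\sum_x p_x \|\theta'_x - \bar{\theta}'\|_1} \leq 2\epsilon, \tag{3.32}$$

where the second inequality uses the concavity of the square root.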

If Alice refrains from the measurement and instead sends $A_1$ through a coherent channel, using
$n(R + \delta)$ cobits, the resulting state is

$$\sum_x \sqrt{p_x}\,|x\rangle^{A_1}|x\rangle^{B_1}|\phi_x\rangle^{Q}.$$

Bob now performs the controlled unitary $\sum_x |x\rangle\langle x|^{B_1} \otimes U_x$, giving rise to

$$|\Upsilon\rangle^{A_1B_1Q} = \sum_x \sqrt{p_x}\,|x\rangle^{A_1}|x\rangle^{B_1}|\theta_x\rangle^{Q}.$$

We may assume, w.l.o.g., that $\langle\theta|\theta_x\rangle$ is real and positive for all $x$, as this can be accomplished by
either Alice or Bob via an $x$-dependent global phase rotation.

We now claim that $|\Upsilon\rangle^{A_1B_1Q}$ is close to $|\Phi_D\rangle^{A_1B_1}|\theta\rangle^{Q}$.



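Indeed

$$\langle\Upsilon|\left(|\Phi_D\rangle|\theta\rangle\right) = \sum_x \sqrt{\frac{p_x}{D}}\,\langle\theta_x|\theta\rangle \geq \sum_x \sqrt{\frac{p_x}{D}}\left(1 - \tfrac{1}{2}\|\theta_x - \theta\|_1\right) \tag{3.33}$$

according to Eq. (1.4). To bound this, we split the sum into two. For the first term, we apply Eq. (1.4)
to the diagonal density matrices $\sum_x p_x |x\rangle\langle x|$ and $D^{-1}\sum_x |x\rangle\langle x|$ to obtain

$$\sum_x \sqrt{\frac{p_x}{D}} \geq 1 - \tfrac{1}{2}\sum_x \left|p_x - D^{-1}\right| \geq 1 - \epsilon. \tag{3.34}$$

The second term is

$$\tfrac{1}{2}\sum_x \sqrt{\frac{p_x}{D}}\,\|\theta_x - \theta\|_1 \leq \tfrac{1}{2}\sum_x \left(p_x + \left|p_x - D^{-1}\right|\right)\|\theta_x - \theta\|_1 \leq \tfrac{1}{2}\sum_x p_x \|\theta_x - \theta\|_1 + \sum_x \left|p_x - D^{-1}\right| \leq 2\epsilon.$$

Putting this together, we find that

$$\langle\Upsilon|\left(|\Phi_D\rangle|\theta\rangle\right) \geq 1 - 3\epsilon$$

and by Eq. (1.4),

$$\|\Upsilon - \Phi_D \otimes \theta\|_1 \leq \sqrt{6\epsilon}.$$

Finally, since tracing out subsystems cannot increase trace distance,

$$\|\Upsilon^{A_1B_1} - \Phi_D\|_1 \leq \sqrt{6\epsilon}.$$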

Thus, the total effect of replacing cbits with cobits is the generation of a state close to $\Phi_D$. This analysis
ignores the fact that the cobits are only given up to an error $\epsilon$. However, due to the triangle inequality,
this only enters in as an additive factor, and the overall error of $\epsilon + \sqrt{6\epsilon}$ is still asymptotically
vanishing. Furthermore, this mapping preserves the $\epsilon$-validity of the original protocol (with respect
to the inputs of $\alpha$) since all we have done to Alice's states is to add purifying systems and add phases,
which w.l.o.g. we can assume are applied to these purifying systems.
We have thus shown

$$\alpha + R[[c \to c]] \geq \beta + R[qq].$$

Eq. (3.9) and Lemmas 1.36 and 1.37 give the desired result

$$\alpha + \frac{R}{2}[q \to q] \geq \beta + \frac{R}{2}[qq].$$

□



3.5.2 Proof of Rule O 



Again fix $\epsilon$ and consider a sufficiently large $n$ so that the protocol is $\epsilon$-valid, $\epsilon^2$-decoupled and accurate
to within $\epsilon$. Now the roles of Alice and Bob are somewhat interchanged. Alice performs a unitary
operation depending on the classical message $x$ to be sent and Bob performs a von Neumann measurement
on some subsystem $B_1$ which almost always succeeds in reproducing the message. Namely,
if we denote by $p_{x'|x}$ the probability of outcome $x'$ given Alice's message was $x$ then, for sufficiently
large $n$,

$$\frac{1}{D}\sum_x p_{x|x} \geq 1 - \epsilon.$$

Again $D = \exp(n(R + \delta))$. Before Bob's measurement, the state of $B_1$ and the remaining quantum
system $Q$ is

$$\sum_{x'} \sqrt{p_{x'|x}}\,|x'\rangle^{B_1}|\phi_{xx'}\rangle^{Q}.$$

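Based on the outcome $x'$ of his measurement, Bob performs some unitary $U_{x'}$ on $Q$, leaving the state
of $Q$ almost decoupled from $xx'$:

$$\left\| \sum_{xx'} D^{-1} p_{x'|x}\,|x\rangle\langle x| \otimes |x'\rangle\langle x'| \otimes \theta'_{xx'} - \sum_{xx'} D^{-1} p_{x'|x}\,|x\rangle\langle x| \otimes |x'\rangle\langle x'| \otimes \bar{\theta}' \right\|_1 \leq \epsilon^2,$$

where $|\theta'_{xx'}\rangle = U_{x'}|\phi_{xx'}\rangle$ and $\bar{\theta}' = D^{-1}\sum_{xx'} p_{x'|x}\,\theta'_{xx'}$. Observe, as before, that we can use Lemma 1.3
to extend $Q$ so that $\theta \supseteq \bar{\theta}'$ and $\theta_{xx'} \supseteq \theta'_{xx'}$ are pure states, $\langle\theta|\theta_{xx'}\rangle$ is real and positive and $\|\theta_{xx'} - \theta\|_1 \leq
2\sqrt{\|\theta'_{xx'} - \bar{\theta}'\|_1}$. Again we use the concavity of $x \mapsto \sqrt{x}$ to bound

$$D^{-1}\sum_x p_{x|x}\|\theta_{xx} - \theta\|_1 \leq D^{-1}\sum_{xx'} p_{x'|x}\|\theta_{xx'} - \theta\|_1 \leq 2\epsilon.$$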



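We now modify the protocol so that instead Alice performs coherent communication. Given a
subsystem $A_1$ in the state $|x\rangle^{A_1}$ she encodes via controlled unitary operations, yielding

$$|x\rangle^{A_1}\sum_{x'} \sqrt{p_{x'|x}}\,|x'\rangle^{B_1}|\phi_{xx'}\rangle^{Q}.$$

Bob refrains from measuring $B_1$ and instead performs the controlled unitary $\sum_{x'} |x'\rangle\langle x'|^{B_1} \otimes U_{x'}$,
giving rise to

$$|x\rangle^{A_1}|\Upsilon_x\rangle^{B_1Q} = |x\rangle^{A_1}\left(\sum_{x'} \sqrt{p_{x'|x}}\,|x'\rangle^{B_1}|\theta_{xx'}\rangle^{Q}\right).$$

We claim that this is a good approximation for $R[[c \to c : \tau]] + \langle\theta\rangle$, and according to the correctness
of the original protocol, $\theta$ is close to the output of $\beta_n$. To check this, suppose Alice inputs $|\Phi_D\rangle^{RA_1}$
into the communication protocol. We will compare the actual state

$$|\Upsilon\rangle^{RA_1B_1Q} := D^{-1/2}\sum_x |x\rangle^{R}|x\rangle^{A_1}|\Upsilon_x\rangle^{B_1Q}$$

with the ideal state

$$|\Phi_{\mathrm{GHZ}}\rangle^{RA_1B_1} \otimes |\theta\rangle^{Q} = D^{-1/2}\sum_x |x\rangle^{R}|x\rangle^{A_1}|x\rangle^{B_1}|\theta\rangle^{Q}.$$

Their inner product is

$$\langle\Upsilon|\left(|\Phi_{\mathrm{GHZ}}\rangle|\theta\rangle\right) = \frac{1}{D}\sum_x \sqrt{p_{x|x}}\,\langle\theta_{xx}|\theta\rangle \geq \frac{1}{D}\sum_x p_{x|x} - \frac{1}{D}\sum_x p_{x|x}\,\tfrac{1}{2}\|\theta_{xx} - \theta\|_1 \geq (1 - \epsilon) - \epsilon = 1 - 2\epsilon.$$

Thus, we can apply Eq. (1.4) to show that

$$\|\Upsilon - \Phi_{\mathrm{GHZ}} \otimes \theta\|_1 \leq 2\sqrt{\epsilon}.$$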

We have thus shown that

$$\alpha \geq \beta + R[[c \to c : \tau]].$$

Using Theorem 1.40 and Eq. (3.9) gives the desired result

$$\alpha \geq \beta + \frac{R}{2}[qq] + \frac{R}{2}[q \to q].$$

□



3.5.3 Proof of Coherent RSP (Eq. 3.14) 

To prove that RSP can be made coherent, we review the proof of Eq. (3.10) from [BHL+05] and
show how it needs to be modified. We will assume knowledge of typical and conditionally typical
projectors; for background on them, as well as the operator Chernoff bound used in the proof, see
[Win99a].

The (slightly modified) proof from [BHL+05] is as follows. Let $\mathcal{E} = \sum_i p_i |i\rangle\langle i|^{X_A} \otimes \psi_i^{AB}$ be an
ensemble of bipartite states, for which we would like to simulate $N_{\mathcal{E}}$ or $U_{\mathcal{E}}$. Alice is given a string
$i^n = (i_1,\ldots,i_n)$ and wants to prepare the joint state $|\psi_{i^n}\rangle^{AB} := |\psi_{i_1}\rangle^{AB} \cdots |\psi_{i_n}\rangle^{AB}$. Let $Q_{i^n}$ be the
empirical distribution of $i^n$, i.e. the probability distribution on $i$ obtained by sampling from $i^n$. We
assume that $\|p - Q_{i^n}\|_1 \leq \delta$, and since our simulation of $U_{\mathcal{E}}$ will be used in some $\eta$-valid protocol, we
can do so with error $\leq \eta + \exp(-O(n\delta^2))$. (Here $\eta, \delta \to 0$ as $n \to \infty$.) Thus, the protocol begins by
Alice projecting onto the set of $i^n$ with $\|p - Q_{i^n}\|_1 \leq \delta$, in contrast with the protocol in [BHL+05],
which begins by having Alice measure $Q_{i^n}$ and send the result to Bob classically.

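Define $\Pi^n_{\mathcal{E}^B|i^n,\delta}$ to be the conditionally typical projector for Bob's half of $|\psi_{i^n}\rangle^{AB}$, and let $\Pi^n_{\mathcal{E}^B,\delta}$
be the typical projector for $n$ copies of $\mathcal{E}^B$. These projectors are defined in [Win99a], which also
proves that the subnormalized state

$$|\psi'_{i^n}\rangle := \left(I \otimes \Pi^n_{\mathcal{E}^B,\delta}\,\Pi^n_{\mathcal{E}^B|i^n,\delta}\right)|\psi_{i^n}\rangle, \tag{3.35}$$

satisfies $\langle\psi'_{i^n}|\psi'_{i^n}\rangle \geq 1 - 2\epsilon$, where $\delta, \epsilon \to 0$ as $n \to \infty$. This implies that $\|\psi_{i^n} - \tilde{\psi}_{i^n}\|_1 \leq 2\sqrt{\epsilon}$, where
we define the normalized state $|\tilde{\psi}_{i^n}\rangle := |\psi'_{i^n}\rangle / \sqrt{\langle\psi'_{i^n}|\psi'_{i^n}\rangle}$. We will now write $|\psi'_{i^n}\rangle$ in a way which
suggests how to construct it. Let $|\Phi_D\rangle^{AB}$ be a maximally entangled state with $D := \operatorname{rank} \Pi^n_{\mathcal{E}^B,\delta}$
and $\Phi_D^B = \Pi^n_{\mathcal{E}^B,\delta}/D$. (By contrast, [BHL+05] chooses $\Phi$ to be a purification of $\Pi^n_{\sigma,\delta}$ with $\sigma :=
\sum_x Q_{i^n}(x)\,\psi_x^B$.) Then $|\psi'_{i^n}\rangle$ can be written as $(M_{i^n} \otimes I)|\Phi_D\rangle$, where $\operatorname{Tr} M_{i^n}^\dagger M_{i^n} = D\,\langle\psi'_{i^n}|\psi'_{i^n}\rangle$.
Thus, Alice will apply a POVM composed of rescaled and rotated versions of $M_{i^n}$ to her half of $|\Phi_D\rangle$,
and after transmitting the measurement outcome $k$ to Bob, he can undo the rotation and obtain his
half of the correct state. The cost of this procedure is $\log D$ ebits and $\log K$ cbits, where we will later
specify the number of POVM outcomes $K$.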

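We now sketch the proof that this is efficient. From [Win99a], we find the bounds

$$D = \operatorname{rank} \Pi^n_{\mathcal{E}^B,\delta} \leq \exp\left(n(H(B)_{\mathcal{E}} + \delta)\right) \tag{3.36}$$

$$\operatorname{Tr}_A |\psi'_{i^n}\rangle\langle\psi'_{i^n}| \leq \exp\left(-n(H(B|X_A)_{\mathcal{E}} + \delta)\right)\Pi^n_{\mathcal{E}^B,\delta} \tag{3.37}$$

Combining these last two equations and Eq. (3.35) with the operator Chernoff bound [Win99a] means
that there exist a set of unitaries $U_1,\ldots,U_K$ such that $\log K \leq n(I(X_A;B)_{\mathcal{E}} + 3\delta + o(1))$ and whenever
$\|Q_{i^n} - p\|_1 \leq \delta$ we have

$$(1 - \epsilon)\,\frac{\Pi^n_{\mathcal{E}^B,\delta}}{D} \leq \frac{1}{K}\sum_{k=1}^{K} U_k\,\tilde{\psi}^B_{i^n}\,U_k^\dagger \leq (1 + \epsilon)\,\frac{\Pi^n_{\mathcal{E}^B,\delta}}{D}. \tag{3.38}$$

These conditions mean that Alice can construct a POVM $\{A_1^{(i^n)},\ldots,A_K^{(i^n)},A_{\mathrm{fail}}^{(i^n)}\}$ with

$$A_k^{(i^n)} := \sqrt{\frac{D}{K(1 + \epsilon)\operatorname{Tr} M_{i^n}^\dagger M_{i^n}}}\;M_{i^n}U_k^* \quad\text{and}\quad A_{\mathrm{fail}}^{(i^n)} := \sqrt{I - \sum_k A_k^{(i^n)\dagger}A_k^{(i^n)}}. \tag{3.39}$$

According to Eq. (3.38), the "fail" outcome has probability $\leq 2\epsilon$ of occurring when Alice applies this
POVM to half of $|\Phi_D\rangle$. And since $(U_k^* \otimes I)|\Phi_D\rangle = (I \otimes U_k^\dagger)|\Phi_D\rangle$, if Alice sends Bob the outcome $k$
and Bob applies $U_k$ then the residual state will be $|\tilde{\psi}_{i^n}\rangle$.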

We now explain how to make the above procedure coherent. First observe that conditioned on not
observing the "fail" outcome, the residual state is completely independent of the classical message $k$.
Thus, we can apply Rule I. However, a variant of Rule O is also applicable, in that there is no need
to assume the input $|i^n\rangle$ is a classical register. Again conditioning on success, the only record of $i^n$
in the final state is the output state $|\tilde{\psi}_{i^n}\rangle$. Thus, if Alice performs the POVM

$$A_k := \sum_{i^n} |i^n\rangle\langle i^n| \otimes A_k^{(i^n)}, \tag{3.40}$$

(with $A_{\mathrm{fail}}$ defined similarly) and Bob decodes using

$$\sum_k |k\rangle\langle k| \otimes U_k \tag{3.41}$$

then (conditioned on a successful measurement outcome) $\sum_{i^n} \sqrt{p_{i^n}}\,|i^n\rangle^{R}|i^n\rangle^{X_A}$ will be coherently
mapped to $\sum_{i^n} \sqrt{p_{i^n}}\,|i^n\rangle^{R}|i^n\rangle^{X_A}|\tilde{\psi}_{i^n}\rangle^{AB}$. This achieves a simulation of $\langle U_{\mathcal{E}} : \mathcal{E}^{X_A}\rangle$ using $I(X_A;B)$
cobits and $H(B)$ ebits. According to Rule I, the coherent communication returns $I(X_A;B)$ ebits at
the end of the protocol, bringing the net entanglement cost down to $H(B|X_A)$. Thus we have proven
Eq. (3.14). □



3.5.4 Proof of Theorem 3.1 

For ease of notation, we first consider the $E = 0$ case, so our starting hypothesis is that $\langle U \rangle \geq C_1[c \to
c] + C_2[c \leftarrow c]$. At the end of the proof we will return to the $E \neq 0$ case.

The definition of $\mathcal{P}_n$

Formally, Eq. (3.3) indicates the existence of sequences of nonnegative real numbers $\{\epsilon_n\}, \{\delta_n\}$
satisfying $\epsilon_n, \delta_n \to 0$ as $n \to \infty$; a sequence of protocols $\mathcal{P}_n = (V_n \otimes W_n)\,U \cdots U\,(V_1 \otimes W_1)\,U\,(V_0 \otimes W_0)$,
where $V_j, W_j$ are local isometries that may also act on extra local ancilla systems, and sequences
of integers $C_1^{(n)}, C_2^{(n)}$ satisfying $nC_1 \geq C_1^{(n)} \geq n(C_1 - \delta_n)$, $nC_2 \geq C_2^{(n)} \geq n(C_2 - \delta_n)$, such that the
following success criterion holds.

Let $a \in \{0,1\}^{C_1^{(n)}}$ and $b \in \{0,1\}^{C_2^{(n)}}$ be the respective messages of Alice and Bob. Let $|\varphi_{ab}\rangle :=
\mathcal{P}_n(|a\rangle_{A_1}|b\rangle_{B_1})$. Note that $|\varphi_{ab}\rangle$ generally occupies a space of larger dimension than $A_1 \otimes B_1$ since
$\mathcal{P}_n$ may add local ancillas. To say that $\mathcal{P}_n$ can transmit classical messages, we require that local
measurements on $|\varphi_{ab}\rangle$ can generate messages $b'$ for Alice and $a'$ for Bob according to a distribution
$\Pr(a'b'|ab)$ such that

$$\forall a,b \quad \sum_{a',b'} \tfrac{1}{2}\left|\Pr(a'b'|ab) - \delta_{a,a'}\delta_{b,b'}\right| \leq \epsilon_n \tag{3.42}$$

where $a', b'$ are summed over $\{0,1\}^{C_1^{(n)}}$ and $\{0,1\}^{C_2^{(n)}}$ respectively. Eq. (3.42) follows from applying
our definition of a protocol to classical communication, taking the final state to be the distribution
of the output classical messages. Since any measurement can be implemented as a joint unitary on
the system and an added ancilla, up to a redefinition of $V_n, W_n$, we can assume

$$|\varphi_{ab}\rangle := \mathcal{P}_n(|a\rangle_{A_1}|b\rangle_{B_1}) = \sum_{a',b'} |b'\rangle_{A_1}|a'\rangle_{B_1}|\gamma^{a,b}_{a',b'}\rangle_{A_2B_2} \tag{3.43}$$

where the dimensions of $A_1$ and $B_1$ are interchanged by $\mathcal{P}_n$, and $|\gamma^{a,b}_{a',b'}\rangle$ are subnormalized states with
$\Pr(a'b'|ab) := \langle\gamma^{a,b}_{a',b'}|\gamma^{a,b}_{a',b'}\rangle$ satisfying Eq. (3.42). Thus, for each $a,b$ most of the weight of $|\varphi_{ab}\rangle$ is
contained in the $|\gamma^{a,b}_{a,b}\rangle$ term, corresponding to error-free transmission of the messages. See Fig. 3-1(a).

The three main ideas for turning classical communication into coherent classical communication

We first give an informal overview of the construction and the intuition behind it. For simplicity,
consider the error-free term with $|\gamma^{a,b}_{a,b}\rangle$ in $A_2B_2$. To see why classical communication via unitary
means should be equivalent to coherent classical communication, consider the special case when
$|\gamma^{a,b}_{a,b}\rangle_{A_2B_2}$ is independent of $a, b$. In this case, copying $a, b$ to local ancilla systems $A_0, B_0$ before $\mathcal{P}_n$
and discarding $A_2B_2$ after $\mathcal{P}_n$ leaves a state within trace distance $\epsilon_n$ of $|b\rangle_{A_1}|a\rangle_{A_0}|a\rangle_{B_1}|b\rangle_{B_0}$, which is the
desired coherent classical communication. See Fig. 3-1(b). In general $|\gamma^{a,b}_{a,b}\rangle_{A_2B_2}$ will carry information
about $a, b$, so tracing $A_2B_2$ will break the coherence of the classical communication. Moreover, if
the Schmidt coefficients of $|\gamma^{a,b}_{a,b}\rangle_{A_2B_2}$ depend on $a, b$, then knowing $a, b$ is not sufficient to coherently
eliminate $|\gamma^{a,b}_{a,b}\rangle_{A_2B_2}$ without some additional communication. The remainder of our proof is built
around the need to coherently eliminate this ancilla.
Our first strategy is to encrypt the classical messages $a, b$ by a shared key, in a manner that
preserves coherence (similar to that in [Leu02]). The coherent version of a shared key is a maximally
entangled state. Thus Alice and Bob (1) again copy their messages to $A_0, B_0$, then (2) encrypt,
(3) apply $\mathcal{P}_n$, and (4) decrypt. Encrypting the message makes it possible to (5) almost decouple
the message from the combined "key-and-ancilla" system, which is approximately in a state $|\Gamma_{00}\rangle$
independent of $a, b$ (exact definitions will follow later). (6) Tracing out $|\Gamma_{00}\rangle$ gives the desired coherent
communication. Let $\mathcal{P}'_n$ denote steps (1)-(5) (see Fig. 3-1(c)).




Figure 3-1: Schematic diagrams for $\mathcal{P}_n$ and $\mathcal{P}'_n$. (a) A given protocol $\mathcal{P}_n$ for two-way classical
communication. The output is a superposition (over all $a',b'$) of the depicted states, with most of the
weight in the $(a',b') = (a,b)$ term. The unlabeled output systems in the state $|\gamma^{a,b}_{a',b'}\rangle$ are $A_2, B_2$. (b)
The same protocol with the inputs copied to local ancillas $A_0, B_0$ before $\mathcal{P}_n$. If $|\gamma^{a,b}_{a,b}\rangle$ is independent
of $a,b$, two-way coherent classical communication is achieved. (c) The five steps of $\mathcal{P}'_n$. Steps (1)-(4)
are shown in solid lines. Again, the inputs are copied to local ancillas, but $\mathcal{P}_n$ is used on messages
encrypted by a coherent one-time-pad (the input $|a\rangle_{A_1}$ is encrypted by the coherent version of the
key $|x\rangle_{A_3}$ and the output $|a' \oplus x\rangle_{B_1}$ is decrypted by $|x\rangle_{B_3}$; similarly, $|b\rangle_{B_1}$ is encrypted by $|y\rangle_{B_4}$ and
$|b' \oplus y\rangle_{A_1}$ decrypted by $|y\rangle_{A_4}$). The intermediate state is shown in the diagram. Step (5), shown in
dotted lines, decouples the messages in $A_{0,1}, B_{0,1}$ from $A_{2,3,4}, B_{2,3,4}$, which is in the joint state very
close to $|\Gamma_{00}\rangle$.



If entanglement were free, then our proof of Theorem 3.1 would be finished. However, we have
borrowed $C_1^{(n)} + C_2^{(n)}$ ebits as the encryption key and replaced it with $|\Gamma_{00}\rangle$. Though the entropy of
entanglement has not decreased (by any significant amount), $|\Gamma_{00}\rangle$ is not directly usable in subsequent
runs of $\mathcal{P}'_n$. To address this problem, we use a second strategy of running $k$ copies of $\mathcal{P}'_n$ in parallel
and performing entanglement concentration of $|\Gamma_{00}\rangle^{\otimes k}$. For sufficiently large $k$, with high probability,
we recover most of the starting ebits. The regenerated ebits can be used for more iterations of $\mathcal{P}'^{\otimes k}_n$
to offset the cost of making the initial $k(C_1^{(n)} + C_2^{(n)})$ ebits, without the need of borrowing from
anywhere.

However, a technical problem arises with simple repetition of $\mathcal{P}_n$, which is that errors accumulate.
In particular, a naïve application of the triangle inequality gives an error $k\epsilon_n$ but $k$,
$n$ are not independent. In fact, the entanglement concentration procedure of [BBPS96] requires
$k \gg \mathrm{Sch}(|\Gamma_{00}\rangle) = \exp(O(n))$ and we cannot guarantee that $k\epsilon_n \to 0$ as $k, n \to \infty$. Our third strategy
is to treat the $k$ uses of $\mathcal{P}'_n$ as $k$ uses of a slightly noisy channel, and encode only $l$ messages (each
having $C_1^{(n)}, C_2^{(n)}$ bits in the two directions) using classical error correcting codes. The error rate
then vanishes with a negligible reduction in the communication rate and now making no assumption
about how quickly $\epsilon_n$ approaches zero. We will see how related errors in decoupling and entanglement
concentration are suppressed.
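We now describe the construction and analyze the error in detail.

The definition of $\mathcal{P}'_n$

0. Alice and Bob begin with inputs $|a\rangle_{A_1}|b\rangle_{B_1}$ and the entangled states $|\Phi\rangle^{\otimes C_1^{(n)}}_{A_3B_3}$ and $|\Phi\rangle^{\otimes C_2^{(n)}}_{A_4B_4}$.
(Systems 3 and 4 hold the two separate keys for the two messages $a$ and $b$.) The initial state
can then be written as

$$\frac{1}{\sqrt{N}}\sum_{x,y} |xx\rangle_{A_3B_3}|yy\rangle_{A_4B_4}|a\rangle_{A_1}|b\rangle_{B_1} \tag{3.44}$$

where $x$ and $y$ are summed over $\{0,1\}^{C_1^{(n)}}$ and $\{0,1\}^{C_2^{(n)}}$, and $N = \exp\left(C_1^{(n)} + C_2^{(n)}\right)$.

1. They coherently copy the messages to $A_0, B_0$.

2. They encrypt the messages using the one-time-pad $|a\rangle_{A_1}|x\rangle_{A_3} \to |a \oplus x\rangle_{A_1}|x\rangle_{A_3}$ and
$|b\rangle_{B_1}|y\rangle_{B_4} \to |b \oplus y\rangle_{B_1}|y\rangle_{B_4}$ coherently to obtain

$$|a\rangle_{A_0}|b\rangle_{B_0}\,\frac{1}{\sqrt{N}}\sum_{x,y} |x\rangle_{A_3}|y\rangle_{A_4}|x\rangle_{B_3}|y\rangle_{B_4}\,|a \oplus x\rangle_{A_1}|b \oplus y\rangle_{B_1}. \tag{3.45}$$

3. Using $U$ $n$ times, they apply $\mathcal{P}_n$ to registers $A_1$ and $B_1$, obtaining an output state

$$|a\rangle_{A_0}|b\rangle_{B_0}\,\frac{1}{\sqrt{N}}\sum_{x,y} |x\rangle_{A_3}|y\rangle_{A_4}|x\rangle_{B_3}|y\rangle_{B_4}\sum_{a',b'} |b' \oplus y\rangle_{A_1}|a' \oplus x\rangle_{B_1}|\gamma^{a \oplus x,\,b \oplus y}_{a' \oplus x,\,b' \oplus y}\rangle_{A_2B_2}. \tag{3.46}$$

4. Alice decrypts her message in $A_1$ using her key $A_4$ and Bob decrypts $B_1$ using $B_3$ coherently
as $|b' \oplus y\rangle_{A_1}|y\rangle_{A_4} \to |b'\rangle_{A_1}|y\rangle_{A_4}$ and $|a' \oplus x\rangle_{B_1}|x\rangle_{B_3} \to |a'\rangle_{B_1}|x\rangle_{B_3}$, producing a state

$$|a\rangle_{A_0}|b\rangle_{B_0}\,\frac{1}{\sqrt{N}}\sum_{x,y} |x\rangle_{A_3}|y\rangle_{A_4}|x\rangle_{B_3}|y\rangle_{B_4}\sum_{a',b'} |b'\rangle_{A_1}|a'\rangle_{B_1}|\gamma^{a \oplus x,\,b \oplus y}_{a' \oplus x,\,b' \oplus y}\rangle_{A_2B_2}. \tag{3.47}$$

5. Further CNOTs $A_1 \to A_4$, $A_0 \to A_3$, $B_1 \to B_3$ and $B_0 \to B_4$ will leave $A_{2,3,4}$ and $B_{2,3,4}$ almost
decoupled from the classical messages. To see this, the state has become

$$\begin{aligned}
&|a\rangle_{A_0}|b\rangle_{B_0}\sum_{a',b'} |b'\rangle_{A_1}|a'\rangle_{B_1}\,\frac{1}{\sqrt{N}}\sum_{x,y} |a \oplus x\rangle_{A_3}|a' \oplus x\rangle_{B_3}|b' \oplus y\rangle_{A_4}|b \oplus y\rangle_{B_4}|\gamma^{a \oplus x,\,b \oplus y}_{a' \oplus x,\,b' \oplus y}\rangle_{A_2B_2} \\
&\quad = |a\rangle_{A_0}|b\rangle_{B_0}\sum_{a',b'} |b'\rangle_{A_1}|a'\rangle_{B_1}|\Gamma_{a \oplus a',\,b \oplus b'}\rangle_{A_{2,3,4}B_{2,3,4}},
\end{aligned} \tag{3.48}$$

where

$$|\Gamma_{a \oplus a',\,b \oplus b'}\rangle := \frac{1}{\sqrt{N}}\sum_{x,y} |a \oplus x\rangle_{A_3}|a' \oplus x\rangle_{B_3}|b' \oplus y\rangle_{A_4}|b \oplus y\rangle_{B_4}|\gamma^{a \oplus x,\,b \oplus y}_{a' \oplus x,\,b' \oplus y}\rangle_{A_2B_2}. \tag{3.49}$$

The fact that $|\Gamma_{a \oplus a',\,b \oplus b'}\rangle$ depends only on $a \oplus a'$ and $b \oplus b'$, without any other dependence on $a$ and
$b$, can be easily seen by replacing $x, y$ with $a \oplus x, b \oplus y$ in the RHS of the above. Note
that $\langle\Gamma_{a \oplus a',\,b \oplus b'}|\Gamma_{a \oplus a',\,b \oplus b'}\rangle = \frac{1}{N}\sum_{x,y} \Pr(a' \oplus x, b' \oplus y \,|\, a \oplus x, b \oplus y)$, so in particular for the state
corresponding to the error-free term, we have $\langle\Gamma_{00}|\Gamma_{00}\rangle = \frac{1}{N}\sum_{x,y} \Pr(xy|xy) := 1 - \bar{\epsilon}_n \geq 1 - \epsilon_n$.*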



*Thus it turns out that Eq. (3.42) was more than we needed; the average error (over all $a, b$) would have been
sufficient. In general, this argument shows that using shared entanglement (or randomness in the case of classical
communication) can convert an average error condition into a maximum error condition, and will be further developed
in [DW05b].

Suppose that Alice and Bob could project onto the space where $a' = a$ and $b' = b$, and tell
each other they have succeeded (by using a little extra communication); then the resulting
ancilla state $\frac{1}{\sqrt{1 - \bar{\epsilon}_n}}|\Gamma_{00}\rangle$ has at least $C_1^{(n)} + C_2^{(n)} + \log(1 - \bar{\epsilon}_n)$ ebits, since its largest Schmidt
coefficient is $\leq \left[\exp\left(C_1^{(n)} + C_2^{(n)}\right)(1 - \bar{\epsilon}_n)\right]^{-1/2}$ and $\bar{\epsilon}_n \leq \epsilon_n$ (cf. Proposition 2.10). Furthermore,
$|\Gamma_{00}\rangle$ is manifestly independent of $a, b$. We will see how to improve the probability of successful
projection onto the error free subspace by using block codes for error correction, and how correct
copies of $|\Gamma_{00}\rangle$ can be identified if Alice and Bob can exchange a small amount of information.
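
The claim that the key-and-ancilla state depends on the messages only through $a \oplus a'$ can be checked on computational-basis labels. The sketch below is a simplification under stated assumptions: it tracks only the Alice-to-Bob message (the Bob-to-Alice half is symmetric), ignores amplitudes, and represents each term of Eq. (3.49) by the labels of its $A_3$ and $B_3$ registers together with the input and output labels of $\gamma$. The function name `gamma_labels` is illustrative.

```python
def gamma_labels(a, a_prime, num_bits):
    """Collect the basis labels of the terms in Eq. (3.49) for a fixed (a, a')."""
    terms = set()
    for x in range(2 ** num_bits):       # sum over all keys x
        sent = a ^ x                     # step 2: Alice encrypts a with key x
        received = a_prime ^ x           # step 3: Bob's register holds a' XOR x
        decrypted = received ^ x         # step 4: Bob decrypts with his key copy
        assert decrypted == a_prime
        a3 = a ^ x                       # step 5: CNOT from A0 onto A3
        b3 = decrypted ^ x               # step 5: CNOT from B1 onto B3
        terms.add((a3, b3, sent, received))
    return frozenset(terms)

def depends_only_on_xor(num_bits):
    """Check that the label set is a function of a XOR a' alone."""
    size = 2 ** num_bits
    return all(
        gamma_labels(a, a_prime, num_bits) == gamma_labels(0, a ^ a_prime, num_bits)
        for a in range(size)
        for a_prime in range(size)
    )
```

Summing over the key $x$ is what removes the dependence on $a$ itself: relabeling $x \to a \oplus x$ maps the terms for $(a, a')$ onto the terms for $(0, a \oplus a')$, which is the substitution described after Eq. (3.49).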

Main idea on how to perform error correction 

As discussed before, $|\Gamma_{00}\rangle$ cannot be used directly as an encryption key; our use of entanglement in
$\mathcal{P}'_n$ is not catalytic. Entanglement concentration of many copies of $|\Gamma_{00}\rangle$ obtained from many runs of
$\mathcal{P}'_n$ will make the entanglement overhead for the one-time-pad negligible, but errors will accumulate.
The idea is to suppress the errors in many uses of $\mathcal{P}'_n$ by error correction. This has to be done with
care, since we need to simultaneously ensure low enough error rates in both the classical message
and the state to be concentrated, as well as sufficient decoupling of the classical messages from other
systems.

Our error-corrected scheme will have k parallel uses of $\mathcal{P}'_n$, but the k inputs are chosen to be a
valid codeword of an error correcting code. Furthermore, for each use of $\mathcal{P}'_n$, the state in $A_{2,3,4}B_{2,3,4}$
will only be collected for entanglement concentration if the error syndrome is trivial for that use of
$\mathcal{P}'_n$. We use the fact that errors occur rarely (at a rate of $\epsilon_n$, which goes to zero as $n \to \infty$) to show
that (1) most states are still used for concentration, and (2) communicating the indices of the states
with nontrivial error syndrome requires a negligible amount of communication.

Definition of $\mathcal{P}''_{nk}$: error corrected version of $(\mathcal{P}'_n)^{\otimes k}$ with entanglement concentration

We construct two codes, one used by Alice to signal to Bob and one from Bob to Alice. We consider 
high distance codes. The distance of a code is the minimum Hamming distance between any two 
codewords, i.e. the number of positions in which they are different. 

First consider the code used by Alice. Let $N_1 = 2^{C_1^{(n)}}$. Alice is coding for a channel that
takes input symbols from $[N_1] := \{1, \ldots, N_1\}$ and has probability $\leq \epsilon_n$ of error on any input (the
error rate depends on both a and b). We would like to encode $[N_1]^l$ in $[N_1]^k$ using a code with
distance $2k\alpha_n$, where $\alpha_n$ is a parameter that will be chosen later. Such a code can correct up to
any $\lfloor k\alpha_n - \frac{1}{2} \rfloor$ errors (without causing much problem, we just say that the code corrects $k\alpha_n$ errors).

Using standard arguments*, we can construct such a code with $l \geq k[1 - 2\alpha_n - H_2(2\alpha_n)/C_1^{(n)}]$,
where $H_2(p) = -p\log p - (1-p)\log(1-p)$ is the binary entropy. The code used by Bob is chosen
similarly, with $N_2 = 2^{C_2^{(n)}}$ input symbols to each use of $\mathcal{P}'_n$. For simplicity, Alice's and Bob's codes
share the same values of l, k and $\alpha_n$. We choose $\alpha_n \geq \max(1/C_1^{(n)}, 1/C_2^{(n)})$ so that $l \geq k(1-3\alpha_n)$.
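
The rate bound above can be checked numerically. The following is a minimal sketch, assuming base-2 logarithms; the function names and the sample values (k = 1000 uses, 50-bit symbols) are illustrative assumptions and do not come from the text. It evaluates $k[1 - 2\alpha_n - H_2(2\alpha_n)/C_1^{(n)}]$ and compares it with the simplified bound $k(1-3\alpha_n)$.

```python
import math

def h2(p):
    """Binary entropy in bits; H2(0) = H2(1) = 0 by convention."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def message_length_bound(k, alpha, c_bits):
    """Lower bound k[1 - 2*alpha - H2(2*alpha)/c_bits] on the message length l.

    k      : block length (number of parallel uses of the protocol)
    alpha  : relative correction radius, so the code distance is 2*k*alpha
    c_bits : bits per symbol, i.e. log2 of the alphabet size
    """
    return k * (1 - 2 * alpha - h2(2 * alpha) / c_bits)

# Hypothetical sample values: 50-bit symbols and alpha = 1/c_bits.
k, c_bits = 1000, 50
alpha = 1 / c_bits
bound = message_length_bound(k, alpha, c_bits)
simplified = k * (1 - 3 * alpha)
```

Because $H_2 \leq 1$, choosing alpha at least 1/c_bits makes the entropy term at most alpha, which is why the simplified bound is never larger than the full one.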

Furthermore, we want the probability of having $\geq k\alpha_n$ errors to be vanishingly small. This
probability is $\leq \exp(-kD(\alpha_n\|\epsilon_n)) \leq \exp(k + k\alpha_n\log\epsilon_n)$ (using arguments from [CT91]) $\leq \exp(-k)$
if $\alpha_n \geq -2/\log\epsilon_n$.
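
The chain of bounds on the error probability can be illustrated with a short calculation. This is a sketch under the assumption that exp and log are taken base 2; the helper names and the sample error rate 1e-6 are hypothetical. It computes the per-use exponents of the three successive bounds at the threshold value $\alpha_n = -2/\log\epsilon_n$.

```python
import math

def kl_bits(a, e):
    """Binary relative entropy D(a||e) in bits, for 0 < a, e < 1."""
    return a * math.log2(a / e) + (1 - a) * math.log2((1 - a) / (1 - e))

def tail_exponents(eps):
    """Per-use exponents (base 2) of the three bounds at alpha = -2/log2(eps)."""
    alpha = -2 / math.log2(eps)
    first = -kl_bits(alpha, eps)          # exponent of exp(-k D(alpha||eps))
    second = 1 + alpha * math.log2(eps)   # exponent of exp(k + k*alpha*log(eps))
    third = -1.0                          # exponent of exp(-k)
    return alpha, first, second, third

# Hypothetical sample error rate.
alpha, first, second, third = tail_exponents(1e-6)
```

The exponents are ordered first <= second <= third, which mirrors the three inequalities in the text; the middle step uses $D(\alpha\|\epsilon) \geq -H_2(\alpha) - \alpha\log\epsilon \geq -1 - \alpha\log\epsilon$.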

Using these codes, Alice and Bob construct $\mathcal{P}''_{nk}$ as follows (with steps 1-3 performed coherently).

0. Let $(a^o_1, \ldots, a^o_l)$ be a vector of l messages each of $C_1^{(n)}$ bits, and $(b^o_1, \ldots, b^o_l)$ be l messages each
of $C_2^{(n)}$ bits.

*We show the existence of a maximal code by repeatedly adding new codewords that have distance $\geq 2k\alpha_n$ from
all other chosen codewords. This gives at least $N^k/\mathrm{Vol}(N, 2k\alpha_n, k)$ codewords, where $\mathrm{Vol}(N, k\delta, k)$ is the number
of words in $[N]^k$ within a distance $k\delta$ of a fixed codeword. But $\mathrm{Vol}(N, k\delta, k) \leq \binom{k}{k\delta} N^{k\delta} \leq 2^{kH_2(\delta)} N^{k\delta}$. (See
[CT91] or Eq. (6.4) later in this thesis for a derivation of $\binom{k}{k\delta} \leq 2^{kH_2(\delta)}$.) Altogether, the number of codewords
$N^l \geq N^k/(2^{kH_2(2\alpha_n)} N^{2k\alpha_n})$, thus $l \geq k[1 - 2\alpha_n - H_2(2\alpha_n)/\log N]$.

1. Using her error correcting code, Alice encodes $(a^o_1, \ldots, a^o_l)$ in a valid codeword $\vec{a} = (a_1, \ldots, a_k)$,
which is a k-vector. Similarly, Bob generates a valid codeword $\vec{b} = (b_1, \ldots, b_k)$ using his code.

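2. Let $\vec{A}_1 := A_1^{\otimes k}$ denote a tensor product of k input spaces each of $C_1^{(n)}$ qubits. Similarly,
$\vec{B}_1 := B_1^{\otimes k}$. (We will also denote k copies of $A_{0,2,3,4}$ and $B_{0,2,3,4}$ by adding the vector symbol.)
Alice and Bob apply $(\mathcal{P}'_n)^{\otimes k}$ to $|\vec{a}\rangle_{\vec{A}_1}|\vec{b}\rangle_{\vec{B}_1}$; that is, in parallel, they apply $\mathcal{P}'_n$ to each pair of
inputs $(a_j, b_j)$. The resulting state is a tensor product of states of the form given by Eq. (3.48):

    $\bigotimes_{j=1}^{k} \Big[ |a_j\rangle_{A_0} |b_j\rangle_{B_0} \sum_{a'_j, b'_j} |b'_j\rangle_{A_1} |a'_j\rangle_{B_1} |\Gamma_{a_j\oplus a'_j,\, b_j\oplus b'_j}\rangle_{A_{2,3,4}B_{2,3,4}} \Big]$.   (3.50)

Define $|\Gamma_{\vec{a}\oplus\vec{a}',\,\vec{b}\oplus\vec{b}'}\rangle_{\vec{A}_{234}\vec{B}_{234}} := \bigotimes_{j=1}^{k} |\Gamma_{a_j\oplus a'_j,\, b_j\oplus b'_j}\rangle_{A_{2,3,4}B_{2,3,4}}$. Then, Eq. (3.50) can be written
more succinctly as

    $|\vec{a}\rangle_{\vec{A}_0} |\vec{b}\rangle_{\vec{B}_0} \sum_{\vec{a}', \vec{b}'} |\vec{b}'\rangle_{\vec{A}_1} |\vec{a}'\rangle_{\vec{B}_1} |\Gamma_{\vec{a}\oplus\vec{a}',\,\vec{b}\oplus\vec{b}'}\rangle_{\vec{A}_{234}\vec{B}_{234}}$.   (3.51)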



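3. Alice performs the error correction step on $\vec{A}_1$ and Bob does the same on $\vec{B}_1$. According to our
code constructions, this (joint) step fails with probability $p_{fail} \leq 2 \cdot 2^{-k}$. (We will see below
why $p_{fail}$ is independent of $\vec{a}$ and $\vec{b}$.)

In order to describe the residual state, we now introduce $\mathcal{G}_A = \{\vec{x} \in [N_1]^k : |\vec{x}| \leq k\alpha_n\}$ and
$\mathcal{G}_B = \{\vec{x} \in [N_2]^k : |\vec{x}| \leq k\alpha_n\}$, where $|\vec{x}| := |\{j : x_j \neq 0\}|$ denotes the Hamming weight of $\vec{x}$. Thus
$\mathcal{G}_{A,B}$ are sets of correctable (good) errors, in the sense that there exist local decoding isometries
$\mathcal{D}_A, \mathcal{D}_B$ such that for any codeword $\vec{a} \in [N_1]^k$ we have $\forall \vec{a}' \in \vec{a} \oplus \mathcal{G}_A$, $\mathcal{D}_A|\vec{a}'\rangle = |\vec{a}\rangle|\vec{a}\oplus\vec{a}'\rangle$ (and
similarly, if $\vec{b} \in [N_2]^k$ is a codeword, then $\forall \vec{b}' \in \vec{b} \oplus \mathcal{G}_B$, $\mathcal{D}_B|\vec{b}'\rangle = |\vec{b}\rangle|\vec{b}\oplus\vec{b}'\rangle$). For concreteness,
let the decoding maps take $\vec{A}_1$ to $\vec{A}_1\vec{A}_5$ and $\vec{B}_1$ to $\vec{B}_1\vec{B}_5$.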

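Conditioned on success, Alice and Bob are left with

    $\frac{1}{\sqrt{1-p_{fail}}} |\vec{a},\vec{b}\rangle_{\vec{A}_{0,1}} |\vec{a},\vec{b}\rangle_{\vec{B}_{0,1}} \sum_{\vec{a}' \in \vec{a}\oplus\mathcal{G}_A} \sum_{\vec{b}' \in \vec{b}\oplus\mathcal{G}_B} |\vec{b}\oplus\vec{b}'\rangle_{\vec{A}_5} |\vec{a}\oplus\vec{a}'\rangle_{\vec{B}_5} |\Gamma_{\vec{a}\oplus\vec{a}',\,\vec{b}\oplus\vec{b}'}\rangle_{\vec{A}_{234}\vec{B}_{234}}$,   (3.52)

where we have defined $\vec{a}'' := \vec{a} \oplus \vec{a}'$ and $\vec{b}'' := \vec{b} \oplus \vec{b}'$, which range over $\mathcal{G}_A$ and $\mathcal{G}_B$ respectively. Note that
$2^{-k+1} \geq p_{fail} = \sum_{(\vec{a}'',\vec{b}'') \notin \mathcal{G}_A\times\mathcal{G}_B} \langle\Gamma_{\vec{a}'',\vec{b}''}|\Gamma_{\vec{a}'',\vec{b}''}\rangle$, which is manifestly independent of $\vec{a}, \vec{b}$. The ancilla is now
completely decoupled from the message, resulting in coherent classical communication. The
only remaining issue is recovering entanglement from the ancilla, so for the remainder of the
protocol we ignore the now decoupled states $|\vec{a},\vec{b}\rangle_{\vec{A}_{0,1}} |\vec{a},\vec{b}\rangle_{\vec{B}_{0,1}}$.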

4. For any $\vec{x}$, define $S(\vec{x}) := \{j : x_j \neq 0\}$ to be the set of positions where $\vec{x}$ is nonzero. If $\vec{x} \in \mathcal{G}_A$ (or $\mathcal{G}_B$),
then $|S(\vec{x})| \leq k\alpha_n$. Thus, $S(\vec{x})$ can be written using $\leq \log\sum_{j \leq k\alpha_n}\binom{k}{j} \leq \log\binom{k}{k\alpha_n} + \log(k\alpha_n) \leq
kH_2(\alpha_n) + \log(k\alpha_n)$ bits.

The next step is for Alice to compute $|S(\vec{b}'')\rangle$ from $|\vec{b}''\rangle$ and communicate it to Bob using
$(kH_2(\alpha_n) + \log(k\alpha_n))$ [c → c]. Similarly, Bob sends $|S(\vec{a}'')\rangle$ to Alice using $(kH_2(\alpha_n) +
\log(k\alpha_n))$ [c ← c]. Here we need to assume the existence of some (possibly inefficient) protocol that sends
O(k) bits in either direction with error $\exp(-k-1)$ (chosen for convenience) and with Rk uses
of U for some constant R. Such a protocol was given by Proposition 2.5 and the bound on the
error can be obtained from the HSW theorem [Hol98, SW97, HN03].

Alice and Bob now have the state

    $\frac{1}{\sqrt{1-p_{fail}}} \sum_{\vec{a}'' \in \mathcal{G}_A} \sum_{\vec{b}'' \in \mathcal{G}_B} |S(\vec{a}'')S(\vec{b}'')\rangle_{\vec{A}_6} |\vec{b}''\rangle_{\vec{A}_5} |S(\vec{a}'')S(\vec{b}'')\rangle_{\vec{B}_6} |\vec{a}''\rangle_{\vec{B}_5} |\Gamma_{\vec{a}'',\vec{b}''}\rangle_{\vec{A}_{234}\vec{B}_{234}}$.   (3.54)

Conditioning on their knowledge of $S(\vec{a}''), S(\vec{b}'')$, Alice and Bob can now identify $k' \geq k(1 - 2\alpha_n)$
positions where $a''_j = b''_j = 0$, and extract k' copies of $\frac{1}{\sqrt{1-\bar\epsilon_n}}|\Gamma_{00}\rangle$. Note that leaking
$S(\vec{a}''), S(\vec{b}'')$ to the environment will not affect the extraction procedure; therefore, coherent
computation and communication of $S(\vec{a}''), S(\vec{b}'')$ is unnecessary. (We have not explicitly
included the environment's copy of $|S(\vec{a}'')S(\vec{b}'')\rangle$ in the equations to minimize clutter.) After
extracting k' copies of $\frac{1}{\sqrt{1-\bar\epsilon_n}}|\Gamma_{00}\rangle$, we can safely discard the remainder of the state, which is
now completely decoupled from both $[\frac{1}{\sqrt{1-\bar\epsilon_n}}|\Gamma_{00}\rangle]^{\otimes k'}$ and the message $|\vec{a}\rangle_{\vec{A}_0}|\vec{b}\rangle_{\vec{A}_1}|\vec{b}\rangle_{\vec{B}_0}|\vec{a}\rangle_{\vec{B}_1}$.



5. Alice and Bob perform entanglement concentration $\mathcal{E}_{conc}$ (using the techniques of [BBPS96])
on $[\frac{1}{\sqrt{1-\bar\epsilon_n}}|\Gamma_{00}\rangle]^{\otimes k'}$. Note that since $\frac{1}{\sqrt{1-\bar\epsilon_n}}|\Gamma_{00}\rangle$ can be created using U n times and then
using classical communication and postselection, it must have Schmidt rank $\leq \mathrm{Sch}(U)^n$, where
$\mathrm{Sch}(U)$ is the Schmidt number of the gate U. Also recall that $E[\frac{1}{\sqrt{1-\bar\epsilon_n}}|\Gamma_{00}\rangle] \geq C_1^{(n)} + C_2^{(n)} +
\log(1-\epsilon_n)$. According to [BBPS96], $\mathcal{E}_{conc}$ requires no communication and with probability
$\geq 1 - \exp[-\mathrm{Sch}(U)^n(\sqrt{k'} - \log(k'+1))]$ produces at least $k'[C_1^{(n)} + C_2^{(n)} + \log(1-\epsilon_n)] -
\mathrm{Sch}(U)^n[\sqrt{k'} - \log(k'+1)]$ ebits.
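
The bit count used in step 4 for describing the set of nonzero positions can be checked directly. The sketch below is an illustration only, assuming base-2 logarithms; the function names and the sample values k = 1000, $\alpha_n$ = 0.02 are hypothetical. It compares the exact number of bits needed to name a subset of [k] of size at most $k\alpha_n$ with the bound $kH_2(\alpha_n) + \log(k\alpha_n)$.

```python
import math

def h2(p):
    """Binary entropy in bits; H2(0) = H2(1) = 0 by convention."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def location_bits_exact(k, alpha):
    """log2 of the number of subsets of [k] with at most floor(k*alpha) elements."""
    m = math.floor(k * alpha)
    count = sum(math.comb(k, j) for j in range(m + 1))
    return math.log2(count)

def location_bits_bound(k, alpha):
    """The bound k*H2(alpha) + log2(k*alpha) quoted in step 4."""
    return k * h2(alpha) + math.log2(k * alpha)

# Hypothetical sample values.
k, alpha = 1000, 0.02
exact = location_bits_exact(k, alpha)
bound = location_bits_bound(k, alpha)
```

For these values the exact count is roughly 138 bits against a bound of roughly 146 bits, both negligible compared with the $k(C_1^{(n)} + C_2^{(n)})$ bits carried by the messages once n is large.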



Error and resource accounting 

$\mathcal{P}''_{nk}$ consumes a total of

(0) nk uses of U (in the k executions of $\mathcal{P}'_n$)

(1) Rk uses of U (for communicating nontrivial syndrome locations)

(2) $k[C_1^{(n)} + C_2^{(n)}]$ [qq] (for the encryption of classical messages).

$\mathcal{P}''_{nk}$ produces, with probability and fidelity no less than

    $1 - 2^{-(k-1)} - 2^{-(k-1)} - \exp[-\mathrm{Sch}(U)^n(\sqrt{k'} - \log(k'+1))]$,

at least

(1) $lC_1^{(n)}$ [[c → c]] + $lC_2^{(n)}$ [[c ← c]]

(2) $k'(C_1^{(n)} + C_2^{(n)} + \log(1-\epsilon_n)) - \mathrm{Sch}(U)^n(\sqrt{k'} - \log(k'+1))$ [qq].

We restate the constraints on the above parameters: $\epsilon_n, \delta_n \to 0$ as $n \to \infty$; $C_1^{(n)} \geq n(C_1 - \delta_n)$,
$C_2^{(n)} \geq n(C_2 - \delta_n)$; $\alpha_n \geq \max(1/C_1^{(n)}, 1/C_2^{(n)}, -2/\log\epsilon_n)$; $k' \geq k(1-2\alpha_n)$; $l \geq k(1-3\alpha_n)$.

We define "error" to include both infidelity and the probability of failure. To leading orders of
k, n, this is equal to $2^{-(k-2)} + \exp[-\sqrt{k}\,\mathrm{Sch}(U)^n]$. We define "inefficiency" to include extra uses
of U, net consumption of entanglement, and the amount by which the coherent classical communication
rates fall short of the classical capacities. To leading order of k, n, these are respectively Rk,
$2\alpha_n k(C_1^{(n)} + C_2^{(n)}) + \sqrt{k}\,\mathrm{Sch}(U)^n \approx 2\alpha_n kn(C_1+C_2) + \sqrt{k}\,\mathrm{Sch}(U)^n$, and $nk(C_1+C_2) - l(C_1^{(n)} + C_2^{(n)}) \leq
nk(3\alpha_n(C_1+C_2) + 2\delta_n)$. We would like the error to vanish, as well as the fractional inefficiency,
defined as the inefficiency divided by kn, the number of uses of U. Equivalently, we can define $f(k,n)$
to be the sum of the error and the fractional inefficiency, and require that $f(k,n) \to 0$ as $nk \to \infty$.

By the above arguments,

    $f(k,n) \leq 2^{-(k-2)} + \exp(-\sqrt{k}\,\mathrm{Sch}(U)^n) + 2\alpha_n(C_1+C_2) + \frac{1}{n\sqrt{k}}\mathrm{Sch}(U)^n + \frac{R}{n} + 3\alpha_n(C_1+C_2) + 2\delta_n$.   (3.55)

Note that for any fixed value of n, $\lim_{k\to\infty} f(k,n) = 5\alpha_n(C_1+C_2) + 2\delta_n + R/n$. (This requires k to
be sufficiently large and also $k \gg \mathrm{Sch}(U)^{2n}$.) Now, allowing n to grow, we have

    $\lim_{n\to\infty} \lim_{k\to\infty} f(k,n) = 0$.   (3.56)

The order of limits in this equation is crucial due to the dependence of k on n.
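
To see why the order of limits matters, the right-hand side of Eq. (3.55) can be evaluated numerically. The following sketch is illustrative only: it reads exp(x) as 2^x (an assumption), and the parameter values (Schmidt number 2, $C_1 + C_2 = 2$, R = 3, $\alpha_n = \delta_n = 0.01$) are hypothetical and not taken from the text.

```python
import math

def f_bound(k, n, sch, c_sum, big_r, alpha_n, delta_n):
    """Right-hand side of Eq. (3.55), with exp(x) read as 2**x (assumption)."""
    s = sch ** n
    ln2 = math.log(2)
    # Error terms: failed error correction/communication and failed concentration.
    error = math.exp(-(k - 2) * ln2) + math.exp(-math.sqrt(k) * s * ln2)
    # Fractional inefficiency terms, in the order they appear in Eq. (3.55).
    inefficiency = (2 * alpha_n * c_sum + s / (n * math.sqrt(k)) + big_r / n
                    + 3 * alpha_n * c_sum + 2 * delta_n)
    return error + inefficiency

def f_limit(n, c_sum, big_r, alpha_n, delta_n):
    """Limit of the bound as k grows with n fixed."""
    return 5 * alpha_n * c_sum + 2 * delta_n + big_r / n

# Hypothetical parameters.
params = dict(c_sum=2.0, big_r=3.0, alpha_n=0.01, delta_n=0.01)
large_k = f_bound(10**12, 5, 2, **params)   # k far above Sch(U)^(2n) = 2^10
limit_n5 = f_limit(5, **params)
small_k = f_bound(100, 20, 2, **params)     # k far below Sch(U)^(2n) = 2^40
```

With k far above $\mathrm{Sch}(U)^{2n}$ the bound sits within about 1e-5 of its limit, whereas with k far below it the term $\mathrm{Sch}(U)^n/(n\sqrt{k})$ dominates and the bound is useless.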

The only remaining problem is our catalytic use of O(nk) ebits. In order to construct a protocol
that uses only U, we need to first use U O(nk) times to generate the starting entanglement. Then we
repeat $\mathcal{P}''_{nk}$ m times, reusing the same entanglement. The catalyst results in an additional fractional
inefficiency of c/m (for some constant c depending only on U) and the errors and inefficiencies of
$\mathcal{P}''_{nk}$ add up to no more than $mf(k,n)$. Choosing $m = \lfloor 1/\sqrt{f(k,n)} \rfloor$ will cause all of these errors and
inefficiencies to simultaneously vanish. (This technique is essentially equivalent to using Lemmas 1.36
and 1.37 and Theorem 1.22.) The actual error condition is that

    $\lim_{m\to\infty} \lim_{n\to\infty} \lim_{k\to\infty} mf(k,n) + \frac{c}{m} = 0$.   (3.57)

This proves the resource inequality

    $U \geq C_1$ [[c → c]] + $C_2$ [[c ← c]].   (3.58)

The E < 0 and E > 0 cases

If $E < 0$ then entanglement is consumed in $\mathcal{P}_n$, so there exists a sequence of integers $E^{(n)} \leq n(E + \delta_n)$
such that

    $\mathcal{P}_n(|a\rangle_{A_1}|b\rangle_{B_1}|\Phi\rangle^{\otimes E^{(n)}}_{A_2B_2}) = \sum_{a',b'} |b'\rangle_{A_1}|a'\rangle_{B_1}|\gamma^{a,b}_{a',b'}\rangle_{A_2B_2}$.   (3.59)

In this case, the analysis for $E^{(n)} = 0$ goes through, only with additional entanglement consumed.
Almost all equations are the same, except now the Schmidt rank for $|\Gamma_{00}\rangle$ is upper-bounded by
$[\mathrm{Sch}(U)2^{E+\delta_n}]^n$ instead of $\mathrm{Sch}(U)^n$. This is still $\leq c^n$ for some constant c, so the same proof of
correctness applies.

If instead $E > 0$, entanglement is created, so for some $E^{(n)} \geq n(E - \delta_n)$ we have

    $\mathcal{P}_n(|a\rangle_{A_1}|b\rangle_{B_1}) = \sum_{a',b'} |b'\rangle_{A_1}|a'\rangle_{B_1}|\gamma^{a,b}_{a',b'}\rangle_{A_2B_2}$   (3.60)

for $E(|\gamma^{a,b}_{a',b'}\rangle_{A_2B_2}) \geq E^{(n)}$. Again, the previous construction and analysis go through, with an extra
$E^{(n)}$ ebits of entanglement in $|\Gamma_{00}\rangle$, and thus an extra fractional inefficiency of $\leq 2\alpha_n E$ in
Eq. (3.55). The Schmidt rank of $|\Gamma_{00}\rangle$ is still upper bounded by $\mathrm{Sch}(U)^n$ in this case. □

Observation 3.10. If $(C_1, C_2, E) \in \mathrm{CCE}(U)$, but $(C_1, C_2, E + \delta) \notin \mathrm{CCE}(U)$ for any $\delta > 0$, then for
any $\epsilon, \delta > 0$ and for n sufficiently large there is a protocol $\mathcal{P}_n$ and a state $|\varphi\rangle_{AB}$ on $\leq \kappa n\delta$ qubits (for
a universal constant $\kappa$), such that for any $x \in \{0,1\}^{\lfloor n(C_1-\delta)\rfloor}$, $y \in \{0,1\}^{\lfloor n(C_2-\delta)\rfloor}$ we have either

    $\mathcal{P}_n|x\rangle_A|y\rangle_B \approx_\epsilon |xy\rangle_A|xy\rangle_B|\Phi\rangle^{\otimes\lfloor n(E-\delta)\rfloor}|\varphi\rangle$

if $E > 0$ or

    $\mathcal{P}_n|x\rangle_A|y\rangle_B|\Phi\rangle^{\otimes\lceil -n(E-\delta)\rceil} \approx_\epsilon |xy\rangle_A|xy\rangle_B|\varphi\rangle$

if $E < 0$.

The key point here is that if E is taken to be the maximum possible for a given $C_1, C_2$, then the
above proof of Theorem 3.1 in fact produces ancilla systems of a sublinear size.

3.6 Discussion 

Quantum information, like quantum computing, has often been studied under an implicit "quantum
co-processor" model, in which quantum resources are used by some controlling classical computer.
Thus, we might use quantum computers or quantum channels to perform classical tasks, like solving
computational problems, encrypting or authenticating a classical message, demonstrating nonlocal
classical correlations, synchronizing classical clocks and so on. On the other hand, since the quantum
resources are manipulated by a classical computer, it is natural to think of conditioning quantum
logical operations on classical information.

This framework has been quite useful for showing the strengths of quantum information relative 
to classical information processing techniques; e.g. we find that secure communication is possible, 
distributed computations require less communication and so on. However, in quantum Shannon 
theory, it is easy to be misled by the central role of classical information in the quantum co-processor 
model. While classical communication may still be a useful goal of quantum Shannon theory, it is 
often inappropriate as an intermediate step. Rather, we find in protocol after protocol that coherently 
decoupled cbits are better thought of as cobits.

Replacing cbits with cobits has significance beyond merely improving the efficiency of quantum
protocols. In many cases, cobits give rise to asymptotically reversible protocols, such as coherent
teleportation and super-dense coding, or more interestingly, remote state preparation and HSW
coding. The resulting resource equalities go a long way towards simplifying the landscape of quantum
Shannon theory: (1) The duality of teleportation and super-dense coding resolves a long-standing
open question about how the original forms of these protocols could be individually optimal, but
wasteful when composed; we now know that all the irreversibility from composing teleportation and
super-dense coding is due to the map [[c → c]] ≥ [c → c]. (2) Coherent RSP and HSW coding give a
resource equality that allows us to easily derive an expression for unitary gate capacity regions. In
the next chapter, we will see more examples of how making classical communication coherent leads
to a wide variety of optimal coding theorems.

Although the implications of coherent classical communication are wide-ranging, the fundamental
insight is quite simple: when studying quantum Shannon theory, we should set aside our intuition
about the central role of classical communication, and instead examine carefully which systems are
discarded and when communication can be coherently decoupled.

Optimal trade-offs in quantum
Shannon theory

The main purpose of quantum information theory, or more particularly quantum Shannon theory, is to
characterize asymptotic resource inter-conversion tasks in terms of quantum information theoretical
quantities such as von Neumann entropy, quantum mutual and coherent informations. A particularly
important class of problems involves a noisy quantum channel or shared noisy entanglement between
two parties which is to be converted into qubits, ebits and/or cbits, possibly assisted by limited use
of qubits, ebits or cbits as an auxiliary resource. In this final chapter on quantum Shannon theory,
we give a full solution for this class of problems.

In Section 4.1, we will state two dual, purely quantum protocols: for entanglement distillation
assisted by quantum communication (the "mother" protocol) and for entanglement assisted quantum
communication (the "father" protocol). From these two, we can derive a large class of "children"
(including many previously known resource inequalities) by direct application of teleportation or
super-dense coding. The key ingredient to deriving the parents, and thus obtaining the entire family,
is coherent classical communication. Specifically, we will show how the parents can be obtained by
applying Rules I and O to many of the previously known children. In each scenario, we will find that
previous proofs of the children already use coherently decoupled cbits (or can be trivially modified
to do so), so that the only missing ingredient is coherent classical communication.

Next, we address the question of optimality. Most of the protocols we consider involve one noisy resource
(such as $\langle\mathcal{N}\rangle$) and two noiseless standard ones (such as qubits and ebits), so instead of capacities
we need to work with two-dimensional capacity regions whose boundaries determine trade-off curves.
We state and prove formulae for each of these capacity regions in Section 4.2.

Finally we give some ideas for improving these results in Section 4.3. 

Bibliographical note: Most of the chapter is based on [DHW05], though parts of Section 4.1 
appeared before in [DHW04]. Both are joint work with Igor Devetak and Andreas Winter. 



4.1 A family of quantum protocols. 

In this section, we consider a family of resource inequalities with one noisy resource in the input
and two noiseless resources in either the input or the output. The "static" members of the family
involve a noisy bipartite state $\rho^{AB}$, while the "dynamic" members involve a general quantum channel
$\mathcal{N} : A' \to B$. In the former case one may define a class of purifications $|\psi\rangle\langle\psi|^{ABE} \supseteq \rho^{AB}$. In the
latter case one may define a class of pure states $|\psi\rangle^{RBE}$, which corresponds to the outcome of sending
half of some $|\phi\rangle^{RA'}$ through the channel's isometric extension $U_\mathcal{N} : A' \to BE$, $U_\mathcal{N} \supseteq \mathcal{N}$.

Recall the identities, for a tripartite pure state $|\psi\rangle^{ABE}$,

    $\frac{1}{2}I(A;B)_\psi + \frac{1}{2}I(A;E)_\psi = H(A)_\psi$

    $\frac{1}{2}I(A;B)_\psi - \frac{1}{2}I(A;E)_\psi = I(A\rangle B)_\psi$.

Henceforth, all entropic quantities will be defined with respect to $|\psi\rangle^{RBE}$ or $|\psi\rangle^{ABE}$, depending on
the context, so we shall drop the $\psi$ subscript.
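
The two identities above follow from purity alone: for a pure state on ABE, H(AB) = H(E) and H(AE) = H(B). The sketch below makes that bookkeeping explicit with exact arithmetic. The function name and the sample entropies H(A) = 3, H(B) = 2, H(E) = 1 are hypothetical, chosen only to satisfy the triangle inequalities that entropies of a pure tripartite state must obey.

```python
from fractions import Fraction

def pure_state_informations(h_a, h_b, h_e):
    """Return I(A;B), I(A;E) and the coherent information I(A>B).

    Assumes the global state on ABE is pure, so that
    H(AB) = H(E) and H(AE) = H(B).
    """
    i_ab = h_a + h_b - h_e    # I(A;B) = H(A) + H(B) - H(AB)
    i_ae = h_a + h_e - h_b    # I(A;E) = H(A) + H(E) - H(AE)
    coherent = h_b - h_e      # I(A>B) = H(B) - H(AB)
    return i_ab, i_ae, coherent

# Hypothetical marginal entropies.
h_a, h_b, h_e = Fraction(3), Fraction(2), Fraction(1)
i_ab, i_ae, coherent = pure_state_informations(h_a, h_b, h_e)
half_sum = i_ab / 2 + i_ae / 2     # should equal H(A)
half_diff = i_ab / 2 - i_ae / 2    # should equal I(A>B)
```

Since the cancellation is purely algebraic, the same check passes for any choice of the three marginal entropies.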

We now introduce the "parent" resource inequalities, deferring their construction until the end
of the section. The "mother" RI is a method for distilling entanglement from a noisy state using
quantum communication:

    $\langle\rho\rangle + \frac{1}{2}I(A;E)$ [q → q] ≥ $\frac{1}{2}I(A;B)$ [qq].   (♀)

There exists a dual "father" RI for entanglement-assisted quantum communication, which is related
to the mother by interchanging dynamic and static resources, and the A and R systems:

    $\frac{1}{2}I(R;E)$ [qq] + $\langle\mathcal{N}\rangle$ ≥ $\frac{1}{2}I(R;B)$ [q → q].   (♂)

We shall combine these parent RIs with the unit RIs corresponding to teleportation, super-dense
coding and entanglement distribution ([q → q] ≥ [qq]) to recover several previously known "children"
protocols.

Each parent has her or his own children (like the Brady Bunch*). 

Let us consider the mother first; she has three children. The first is a variation of the hashing
inequality Eq. (1.49), which follows from the mother and teleportation.

    $\langle\rho\rangle + I(A;E)$ [c → c] + $\frac{1}{2}I(A;E)$ [qq] ≥ $\langle\rho\rangle + \frac{1}{2}I(A;E)$ [q → q]
        ≥ $\frac{1}{2}I(A;B)$ [qq]
        = $I(A\rangle B)$ [qq] + $\frac{1}{2}I(A;E)$ [qq].

By the Cancellation Lemma (1.37),

    $\langle\rho\rangle + I(A;E)$ [c → c] + o [qq] ≥ $I(A\rangle B)$ [qq].   (4.1)

This is slightly weaker than Eq. (1.49). Further combining with teleportation gives a variation on
noisy teleportation Eq. (1.50):

    $\langle\rho\rangle + I(A;B)$ [c → c] + o [qq] ≥ $I(A\rangle B)$ [q → q].   (4.2)

The third child is noisy super-dense coding (Eq. (1.48)), obtained by combining the mother with
super-dense coding:

    $H(A)$ [q → q] + $\langle\rho\rangle$ = $\frac{1}{2}I(A;B)$ [q → q] + $\frac{1}{2}I(A;E)$ [q → q] + $\langle\rho\rangle$
        ≥ $\frac{1}{2}I(A;B)$ [q → q] + $\frac{1}{2}I(A;B)$ [qq]
        ≥ $I(A;B)$ [c → c].
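
The resource bookkeeping behind the mother's three children can be traced with a few lines of exact arithmetic. This is a sketch only: the function names are hypothetical, the sublinear catalytic o [qq] term is ignored, and the sample values I(A;B) = 4, I(A;E) = 2 (so that H(A) = 3 and I(A⟩B) = 1) are illustrative assumptions.

```python
from fractions import Fraction

def mother(i_ab, i_ae):
    """Mother RI per copy of rho: I(A;E)/2 qubits in, I(A;B)/2 ebits out."""
    return {"qubits_in": i_ae / 2, "ebits_out": i_ab / 2}

def hashing_variant(i_ab, i_ae):
    """Teleport the mother's qubits; each costs 2 cbits and 1 ebit."""
    m = mother(i_ab, i_ae)
    cbits_in = 2 * m["qubits_in"]
    net_ebits = m["ebits_out"] - m["qubits_in"]   # ebits left after repaying teleportation
    return cbits_in, net_ebits

def noisy_teleportation_variant(i_ab, i_ae):
    """Spend the net ebits on teleportation: 2 more cbits per qubit sent."""
    cbits_in, net_ebits = hashing_variant(i_ab, i_ae)
    return cbits_in + 2 * net_ebits, net_ebits    # (cbits in, qubits out)

def noisy_superdense(i_ab, i_ae):
    """Use every distilled ebit for super-dense coding: 1 qubit + 1 ebit gives 2 cbits."""
    m = mother(i_ab, i_ae)
    qubits_in = m["qubits_in"] + m["ebits_out"]
    return qubits_in, 2 * m["ebits_out"]          # (qubits in, cbits out)

# Hypothetical values of the mutual informations.
i_ab, i_ae = Fraction(4), Fraction(2)
hashing = hashing_variant(i_ab, i_ae)
ntp = noisy_teleportation_variant(i_ab, i_ae)
nsd = noisy_superdense(i_ab, i_ae)
```

For these values the three functions return (2, 1), (4, 1) and (3, 4), matching the rates I(A;E) and I(A⟩B) in Eq. (4.1), I(A;B) and I(A⟩B) in Eq. (4.2), and H(A) and I(A;B) for noisy super-dense coding.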



* The Brady Bunch, running from 26 September 1969 till 8 March 1974, was a popular show of the American
Broadcasting Company about a couple with three children each from their previous marriages. For more information,
see [Mor95].

The father happens to have only two children. One of them is the entanglement-assisted classical
capacity RI (1.46), obtained by combining the father with (SD):

    $H(R)$ [qq] + $\langle\mathcal{N}\rangle$ = $\frac{1}{2}I(R;B)$ [qq] + $\frac{1}{2}I(R;E)$ [qq] + $\langle\mathcal{N}\rangle$
        ≥ $\frac{1}{2}I(R;B)$ [qq] + $\frac{1}{2}I(R;B)$ [q → q]
        ≥ $I(R;B)$ [c → c].

The second is a variation on the quantum channel capacity result (Eq. (1.47)). It is obtained by
combining the father with entanglement distribution.

    $\frac{1}{2}I(R;E)$ [qq] + $\langle\mathcal{N}\rangle$ ≥ $\frac{1}{2}I(R;B)$ [q → q]
        = $\frac{1}{2}I(R;E)$ [q → q] + $I(R\rangle B)$ [q → q]
        ≥ $\frac{1}{2}I(R;E)$ [qq] + $I(R\rangle B)$ [q → q].

Hence, by the Cancellation Lemma

    $\langle\mathcal{N}\rangle$ + o [qq] ≥ $I(R\rangle B)$ [q → q].   (4.3)

Alas, we do not know how to get rid of the o term without invoking further results. For instance,
the original proof of the hashing inequality and the HSW theorem allow us to get rid of the o term,
by Lemma 1.36. Quite possibly the original proof [Llo96, Sho02, Dev05a] is needed.

Constructing the parent protocols using coherification rules. 

Having demonstrated the power of the parent resource inequalities, we now address the question 
of constructing protocols implementing them. 

Corollary 4.1. The mother RI is obtained from the hashing inequality (Eq. (1.49)) by applying Rule
I.

Proof. It can be readily checked that the protocol from [DW05a, DW04] implementing Eq. (1.49) indeed
satisfies the conditions of Rule I. The approximate uniformity condition is in fact exact in this case. □

Corollary 4.2. The father RI follows from the EAC protocol from [BSST02]. 

Proof. The main observation is that the protocol from [BSST02] implementing Eq. (1.46) in fact
outputs a private classical channel as it is! We shall analyze the protocol in the CP picture. Alice
and Bob share a maximally entangled state $|\Phi_D\rangle^{AB'}$. Alice encodes her message m via a unitary $U_m$:

    $m \mapsto (U_m^A \otimes \mathbb{1}^{B'})|\Phi_D\rangle^{AB'} = (\mathbb{1}^A \otimes (U_m^T)^{B'})|\Phi_D\rangle^{AB'}$.

Applying the channel $(U_\mathcal{N}^{A\to BE})^{\otimes n}$ yields

    $|\Upsilon_m\rangle^{BB'E} = ((U_m^T)^{B'} \otimes \mathbb{1}^{BE})|\Psi\rangle^{BB'E}$,

where $|\Psi\rangle^{BB'E} = (U_\mathcal{N}^{A\to BE})^{\otimes n}|\Phi_D\rangle^{AB'}$. Bob's decoding operation consists of adding an ancilla system
$\hat{B}$ in the state $|0\rangle^{\hat{B}}$, performing some unitary $U^{BB'\hat{B}}$ and von Neumann measuring the ancilla $\hat{B}$.
Before the von Neumann measurement the state of the total system is

    $|\Upsilon''_m\rangle^{BB'\hat{B}E} = U^{BB'\hat{B}}|\Upsilon_m\rangle^{BB'E}|0\rangle^{\hat{B}}$.

After the measurement, the message m is correctly decoded with probability $1 - \epsilon$. By the gentle
operator lemma [Win99a], $U^{BB'\hat{B}}$ could have been chosen so that upon correct decoding, the
post-measurement state $|\Upsilon'_m\rangle^{BB'E}$ satisfies

    $\|\Upsilon'_m - \Upsilon_m\|_1 \leq \sqrt{8\epsilon}$.

Assuming Bob correctly decodes m, he then applies $U_m^*$ to B', bringing the system BB'E into the
state $|\Psi'_m\rangle = ((U_m^*)^{B'} \otimes \mathbb{1}^{BE})|\Upsilon'_m\rangle$, for which

    $\|\Psi'_m - \Psi\|_1 \leq \sqrt{8\epsilon}$,

for all m. Thus m is coherently decoupled from BB'E, and we may apply Rule O. □

Corollary 4.3. The mother RI follows from the NSD protocol from [HHH+01].

Proof. The proof is almost the same as for the previous Corollary. □ 

4.2 Two dimensional trade-offs for the family 

It is natural to ask about the optimality of our family of resource inequalities. In this section we
show that they indeed give rise to optimal two dimensional capacity regions, the boundaries of which
are referred to as trade-off curves. To each family member corresponds a theorem identifying the
operationally defined capacity region $C(\rho^{AB})$ ($C(\mathcal{N})$) with a formula $\tilde{C}(\rho^{AB})$ ($\tilde{C}(\mathcal{N})$) given in terms
of entropic quantities evaluated on states associated with the given noisy resource $\rho^{AB}$ ($\mathcal{N}$). Each
such theorem consists of two parts: the direct coding theorem which establishes $\tilde{C} \subseteq C$ and the
converse which establishes $C \subseteq \tilde{C}$.

4.2.1 Grandmother protocol 

To prove the trade-offs involving static resources, we will first need to extend the mother protocol
(Eq. ♀) to a "grandmother" RI by combining it with instrument compression (Eq. 1.35).

Theorem 4.4 (Grandmother). Given a static resource $\rho^{AB}$, for any remote instrument $\mathbf{T} : A \to
A'X_B$, the following RI holds

    $\frac{1}{2}I(A';EE'|X_B)_\sigma$ [q → q] + $I(X_B;BE)_\sigma$ [c → c] + $\langle\rho^{AB}\rangle$ ≥ $\frac{1}{2}I(A';B|X_B)_\sigma$ [qq].   (4.4)

In the above, the state $\sigma^{X_BA'BEE'}$ is defined by

    $\sigma^{X_BA'BEE'} = \tilde{\mathbf{T}}^{A\to A'E'X_B}(\psi^{ABE})$,

where $|\psi\rangle\langle\psi|^{ABE} \supseteq \rho^{AB}$ and $\tilde{\mathbf{T}} : A \to A'E'X_B$ is a QP extension of $\mathbf{T}$.

Proof. By the instrument compression RI (1.35),

    $\langle\rho^{AB}\rangle + I(X_B;BE)_\sigma$ [c → c] + $H(X_B|BE)_\sigma$ [cc] ≥ $\langle\rho^{AB}\rangle + \langle\overline{\Delta}^{X_B\to X_AX_B} \circ \mathbf{T} : \rho^A\rangle$
        ≥ $\langle\overline{\Delta}^{X_B\to X_AX_B}(\sigma^{X_BA'B})\rangle$.

On the other hand, by Theorem 1.41 and the mother inequality (♀),

    $\langle\overline{\Delta}^{X_B\to X_AX_B}(\sigma^{X_BA'B})\rangle + \frac{1}{2}I(A';EE'|X_B)_\sigma$ [q → q] ≥ $\frac{1}{2}I(A';B|X_B)_\sigma$ [qq].

The grandmother RI is obtained by adding the above RIs, followed by a derandomization via
Corollary 1.39. □

Figure 4-1: A general protocol for noisy super-dense coding.



Corollary 4.5. In the above theorem, one may consider the special case where $\mathbf{T} : A \to A'X_B$
corresponds to some ensemble of operations $(p_x, \mathcal{E}_x)$, $\mathcal{E}_x : A \to A'$, via the identification

    $\mathbf{T} : \rho^A \mapsto \sum_x p_x |x\rangle\langle x|^{X_B} \otimes \mathcal{E}_x(\rho^A)$.

Then the [c → c] term from Eq. (4.4) vanishes identically. □

4.2.2 Trade-off for noisy super-dense coding

Now that we are comfortable with the various formalisms, the formulae will reflect the QP formalism, 
whereas the language will be more in the CQ spirit. 

Given a bipartite state $\rho^{AB}$, the noisy super-dense coding capacity region $C_{NSD}(\rho^{AB})$ is the
two-dimensional region in the (Q, R) plane with $Q \geq 0$ and $R \geq 0$ satisfying the RI

    $\langle\rho^{AB}\rangle + Q$ [q → q] ≥ $R$ [c → c].   (4.5)

Theorem 4.6. The capacity region $C_{NSD}(\rho^{AB})$ is given by

    $C_{NSD}(\rho^{AB}) = \tilde{C}_{NSD}(\rho^{AB}) := \overline{\bigcup_{l=1}^{\infty} \frac{1}{l}\tilde{C}^{(1)}_{NSD}((\rho^{AB})^{\otimes l})}$,

where $\overline{S}$ means the closure of a set S and $\tilde{C}^{(1)}_{NSD}(\rho^{AB})$ is the set of all $R \geq 0$, $Q \geq 0$ such that

    $R \leq Q + \max_\sigma\{I(A'\rangle BX)_\sigma : H(A'|X)_\sigma \leq Q\}$.

In the above, $\sigma$ is of the form

    $\sigma^{XA'B} = \sum_x p_x |x\rangle\langle x|^X \otimes \mathcal{E}_x^{A\to A'}(\rho^{AB})$,   (4.6)

for some ensemble of operations $(p_x, \mathcal{E}_x)$, $\mathcal{E}_x : A \to A'$.

Proof. We first prove the converse. Fix $n, R, Q, \delta, \epsilon$, and use the Flattening Lemma (1.17) so that we
can assume that k = 1. The resources available are

• The state $(\rho^{AB})^{\otimes n}$ shared between Alice and Bob. Let it be contained in the system $A^nB^n$, of
total dimension $d^n$, which we shall call AB for short.

• A perfect quantum channel $\mathrm{id} : A' \to A'$, $\dim A' = 2^{nQ}$, from Alice to Bob (after which A'
belongs to Bob despite the notation!).

The resource to be simulated is the perfect classical channel of size $D = 2^{n(R-\delta)}$ on any source, in
particular on the random variable X corresponding to the uniform distribution $\pi_D$.

In the protocol (see Fig. 4-1), Alice performs a {cq → q} encoding $(\mathcal{E}_x : A \to A')_x$, depending
on the source random variable, and then sends the A' system through the perfect quantum channel.
After time t Bob performs a POVM $\Lambda : A'B \to X'$, on the system A'B, yielding the random variable
X'. The protocol ends at time $t_f$. Unless otherwise stated, the entropic quantities below refer to the
state of the system at time t.

Since at time $t_f$ the state of the system XX' is supposed to be $\epsilon$-close to the perfectly correlated
state on XX', Lemma 1.2 implies

    $I(X;X')_{t_f} \geq n(R-\delta) - \eta(\epsilon) - K\epsilon nR$.

By the Holevo bound [Hol73],

    $I(X;X')_{t_f} \leq I(X;A'B)$.

Recall from Eq. (1.1) the identity

    $I(X;A'B) = H(A') + I(A'\rangle BX) - I(A';B) + I(X;B)$.

Since $I(A';B) \geq 0$, and in our protocol $I(X;B) = 0$, this becomes

    $I(X;A'B) \leq H(A') + I(A'\rangle BX)$.

Observing that

    $nQ \geq H(A') \geq H(A'|X)$,

these all add up to

    $R \leq Q + \frac{1}{n}I(A'\rangle BX) + \delta + KR\epsilon + \frac{\eta(\epsilon)}{n}$.

As these are true for any $\epsilon, \delta > 0$ and sufficiently large n, the converse holds.

Regarding the direct coding theorem, it suffices to demonstrate the RI

    $\langle\rho^{AB}\rangle + H(A'|X)_\sigma$ [q → q] ≥ $I(A';B|X)_\sigma$ [c → c].

This, in turn, follows from linearly combining Corollary 4.5 with super-dense coding (Eq. 1.40) much
in the same way the noisy super-dense coding RI (Eq. 1.48) follows from the mother (Eq. ♀). □



4.2.3 Trade-off for quantum communication assisted entanglement distillation

Given a bipartite state $\rho^{AB}$, the quantum communication assisted entanglement distillation capacity
region (or "mother" capacity region for short) $C_M(\rho^{AB})$ is the set of (Q, E) with $Q \geq 0$ and $E \geq 0$
satisfying the RI

    $\langle\rho^{AB}\rangle + Q$ [q → q] ≥ $E$ [qq].   (4.7)

(This RI is trivially false for $Q < 0$ and trivially true for $Q \geq 0$ and $E \leq 0$.)

Theorem 4.7. The capacity region $C_M(\rho^{AB})$ is given by

    $C_M(\rho^{AB}) = \tilde{C}_M(\rho^{AB}) := \overline{\bigcup_{l=1}^{\infty} \frac{1}{l}\tilde{C}^{(1)}_M((\rho^{AB})^{\otimes l})}$,

where $\tilde{C}^{(1)}_M(\rho^{AB})$ is the set of all $Q \geq 0$, $E \geq 0$ such that

    $E \leq Q + \max_\sigma\{I(A'\rangle BX)_\sigma : \frac{1}{2}I(A';EE'|X)_\sigma \leq Q\}$.   (4.8)

In the above, $\sigma$ is the QP version of Eq. (4.6), namely

    $\sigma^{XA'BEE'} = \sum_x p_x |x\rangle\langle x|^X \otimes U_x^{A\to A'E'}(\psi^{ABE})$,   (4.9)

for some ensemble of isometries $(p_x, U_x)$, $U_x : A \to A'E'$, and purification $|\psi\rangle\langle\psi|^{ABE} \supseteq \rho^{AB}$.

Proof. We first prove the converse, which in this case follows from the converse for the noisy
super-dense coding trade-off. The main observation is that super-dense coding (Eq. (1.40)) induces an
invertible linear map f between the (Q, E) and (Q, R) planes corresponding to the mother capacity
region and that of noisy super-dense coding, respectively, defined by

    $f : (Q, E) \mapsto (Q + E, 2E)$.

By adding super-dense coding (i.e. E [qq] + E [q → q] ≥ 2E [c → c]) to the mother (Eq. 4.7), we find

    $f(C_M) \subseteq C_{NSD}$.   (4.10)

On the other hand, by inspecting the definitions of $\tilde{C}_{NSD}$ and $\tilde{C}_M$, we can verify

    $\tilde{C}_{NSD} = f(\tilde{C}_M)$.   (4.11)

The converse for the noisy super-dense coding trade-off is written as $C_{NSD} \subseteq \tilde{C}_{NSD}$. As f is a
bijection, putting everything together we have

    $C_M \subseteq f^{-1}(C_{NSD}) \subseteq f^{-1}(\tilde{C}_{NSD}) = \tilde{C}_M$,

which is the converse for the mother trade-off.

The direct coding theorem follows immediately from Corollary 4.5. □
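
The linear map f used in this proof, and its inverse, are simple enough to write down explicitly. The following sketch is an illustration with hypothetical function names; the sample point assumes a state with H(A) = 3, I(A;B) = 4 and I(A;E) = 2, values chosen only for the example.

```python
from fractions import Fraction

def f(q, e):
    """Map a mother point (Q, E) to a noisy super-dense coding point (Q + E, 2E).

    Super-dense coding spends one extra qubit per ebit and returns two cbits.
    """
    return (q + e, 2 * e)

def f_inverse(q, r):
    """Inverse map: (Q', R) in the NSD plane goes to (Q' - R/2, R/2)."""
    half_r = Fraction(r) / 2
    return (q - half_r, half_r)

# Hypothetical mother corner point (I(A;E)/2, I(A;B)/2) = (1, 2).
mother_point = (Fraction(1), Fraction(2))
nsd_point = f(*mother_point)
round_trip = f_inverse(*nsd_point)
```

Here the mother corner point (1, 2) is sent to (3, 4), i.e. to (H(A), I(A;B)), which is the corner point of noisy super-dense coding for the same state, and the inverse recovers the original point.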

4.2.4 Trade-off for noisy teleportation 

Given a bipartite state $\rho^{AB}$, the noisy teleportation capacity region $C_{NTP}(\rho^{AB})$ is a
two-dimensional region in the (R, Q) plane with $R \geq 0$ and $Q \geq 0$ satisfying the RI

    $\langle\rho^{AB}\rangle + R$ [c → c] ≥ $Q$ [q → q].   (4.12)

Theorem 4.8. The capacity region $C_{NTP}(\rho^{AB})$ is given by

    $C_{NTP}(\rho^{AB}) = \tilde{C}_{NTP}(\rho^{AB}) := \overline{\bigcup_{l=1}^{\infty} \frac{1}{l}\tilde{C}^{(1)}_{NTP}((\rho^{AB})^{\otimes l})}$,

where $\tilde{C}^{(1)}_{NTP}(\rho^{AB})$ is the set of all $R \geq 0$, $Q \geq 0$ such that

    $Q \leq \max_\sigma\{I(A'\rangle BX)_\sigma : I(A';B|X)_\sigma + I(X;BE)_\sigma \leq R\}$.   (4.13)

In the above, $\sigma$ is of the form

    $\sigma^{XA'BE} = \mathbf{T}(\psi^{ABE})$,   (4.14)

for some instrument $\mathbf{T} : A \to A'X$ and purification $|\psi\rangle\langle\psi|^{ABE} \supseteq \rho^{AB}$.

Figure 4-2: A general protocol for noisy teleportation.



Proof. We first prove the converse. Fix $n, Q, R, \delta, \epsilon$, and use the Flattening Lemma so we can assume
that the depth is one. The resources available are

• The state $(\rho^{AB})^{\otimes n}$ shared between Alice and Bob. Let it be contained in the system $A^nB^n$,
which we shall call AB for short.

• A perfect classical channel of size $2^{nR}$.

The resource to be simulated is the perfect quantum channel $\mathrm{id}_D : A_1 \to B_1$, $D = \dim A_1 = 2^{n(Q-\delta)}$,
from Alice to Bob, on any source, in particular on the maximally entangled state $\Phi^{A'A_1}$.

In the protocol (see Fig. 4-2), Alice performs a POVM $\Lambda : AA_1 \to X$ on the system $AA_1$, and
sends the outcome random variable X through the classical channel. After time t Bob performs a
{cq → q} decoding quantum operation $\mathcal{D} : XB \to B_1$. The protocol ends at time $t_f$. Unless otherwise
stated, the entropic quantities below refer to the time t.

Our first observation is that performing the POVM $\Lambda$ induces an instrument $\mathbf{T} : A \to A'X$* so
that the state of the system XA'BE at time t is indeed of the form of Eq. (4.14).

Since at time tf the state of the system A' B\ is supposed to be e-close to Lemma 1.2 implics 

H A 'Wt f > n (Q - <*) - - KenQ. 
By the data processing inequality, 

I{A')B 1 ) tf < I(A')BX). 

Thus 

+■ KQe 



Q < -I(A')BX) 
n 



(4.15) 



*Indoed, first a pure ancilla A! A\ was appended, then another pure ancilla X was appcndcd, the system AA 1 A\X 
was rotatcd to A' E'X, and finally X was measured and E' was traced out. 
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To bound R, start with the identity 

$$I(X; A'BE) = H(A') + I(A'\rangle BEX) - I(A'; BE) + I(X; BE).$$

Since $I(A';BE) = 0$, $H(A') \geq H(A'|X)$ and $I(A'\rangle BEX) \geq I(A'\rangle BX)$, this becomes

$$I(X; A'BE) \geq I(A';B|X) + I(X; BE).$$

Combining this with

$$nR \geq H(X) \geq I(X; A'BE)$$

gives the desired

$$R \geq \frac{1}{n}\left[I(A';B|X) + I(X;BE)\right]. \tag{4.16}$$

As Eqns. (4.15) and (4.16) are true for any $\epsilon, \delta > 0$ and sufficiently large $n$, the converse holds.
Regarding the direct coding theorem, it suffices to demonstrate the RI

$$\langle\rho^{AB}\rangle + \left(I(A'; B|X)_\sigma + I(X; BE)_\sigma\right) [c \to c] \geq I(A'\rangle BX)_\sigma\, [q \to q]. \tag{4.17}$$

Linearly combining the grandmother RI (Eq. (4.4)) with teleportation (Eq. (1.39)), much in the same
way the variation on the noisy teleportation RI (Eq. (4.2)) was obtained from the mother (Eq. (♀)),
we have

$$\langle\rho^{AB}\rangle + \left(I(A'; B|X)_\sigma + I(X; BE)_\sigma\right) [c \to c] + o[qq] \geq I(A'\rangle BX)_\sigma\, [q \to q].$$

Equation (4.17) follows by invoking Lemma 1.36 and Eq. (1.49). □
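
The entropy identity used above to bound $R$ holds for every state on $XA'BE$, so it can be sanity-checked numerically. The following is a minimal sketch, assuming NumPy and treating all four systems as qubits; the helper names (`random_density_matrix`, `entropy`, `mutual`, `coherent`) are hypothetical and are not part of the thesis.

```python
import numpy as np

def random_density_matrix(dim, rng):
    """Sample a full-rank density matrix (Ginibre ensemble; an illustrative choice)."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def entropy(rho, dims, keep):
    """Von Neumann entropy (bits) of the marginal of rho on the subsystems in `keep`."""
    n = len(dims)
    keep = sorted(keep)
    t = rho.reshape(dims + dims)
    # Trace out the unwanted subsystems, highest index first.
    for k in sorted(set(range(n)) - set(keep), reverse=True):
        m = t.ndim // 2
        t = np.trace(t, axis1=k, axis2=k + m)
    d = int(np.prod([dims[k] for k in keep]))
    ev = np.linalg.eigvalsh(t.reshape(d, d))
    ev = ev[ev > 1e-12]
    return float(-(ev * np.log2(ev)).sum())

X, A, B, E = 0, 1, 2, 3          # A plays the role of A'
dims = [2, 2, 2, 2]              # assumed sizes, for illustration only
rng = np.random.default_rng(0)
rho = random_density_matrix(16, rng)
H = lambda *s: entropy(rho, dims, s)

def mutual(S, T):
    """Quantum mutual information I(S;T)."""
    return H(*S) + H(*T) - H(*S, *T)

def coherent(S, T):
    """Coherent information I(S>T) = H(T) - H(ST)."""
    return H(*T) - H(*S, *T)

lhs = mutual([X], [A, B, E])
rhs = H(A) + coherent([A], [B, E, X]) - mutual([A], [B, E]) + mutual([X], [B, E])
print(abs(lhs - rhs))
```

The two sides agree up to floating-point error for any input state; the inequalities that follow in the proof additionally use the structure of the protocol (for instance $I(A';BE) = 0$), which a random state does not satisfy.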

4.2.5 Trade-off for classical communication assisted entanglement distillation

Given a bipartite state $\rho^{AB}$, the classical communication assisted entanglement distillation capacity
region (or "entanglement distillation" capacity region for short) $C_{\mathrm{ED}}(\rho^{AB})$ is the two-dimensional
region in the $(R, E)$ plane with $R \geq 0$ and $E \geq 0$ satisfying the RI

$$\langle\rho^{AB}\rangle + R\,[c \to c] \geq E\,[qq]. \tag{4.18}$$

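Theorem 4.9. The capacity region $C_{\mathrm{ED}}(\rho^{AB})$ is given by

$$C_{\mathrm{ED}}(\rho^{AB}) = \widetilde{C}_{\mathrm{ED}}(\rho^{AB}) := \bigcup_{l=1}^{\infty} \frac{1}{l}\, \widetilde{C}^{(1)}_{\mathrm{ED}}\big((\rho^{AB})^{\otimes l}\big),$$

where $\widetilde{C}^{(1)}_{\mathrm{ED}}(\rho^{AB})$ is the set of all $R \geq 0$, $E \geq 0$ such that

$$E \leq \max_{\sigma} \left\{ I(A'\rangle BX)_\sigma : I(A';EE'|X)_\sigma + I(X;BE)_\sigma \leq R \right\}. \tag{4.19}$$

In the above, $\sigma$ is the fully QP version of Eq. (4.14), namely

$$\sigma^{XA'BEE'} = \mathbf{T}(\psi^{ABE}) \tag{4.20}$$

for some instrument $\mathbf{T} : A \to A'E'X$ with pure quantum output and purification $|\psi\rangle\langle\psi|^{ABE} \supseteq \rho^{AB}$.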

Proof. We first prove the converse, which in this case follows from the converse for the noisy
teleportation trade-off. The argument very much parallels that of the converse for the mother trade-off.
The main observation is that teleportation (Eq. (1.39)) induces an invertible linear map $g$ between
the $(R, E)$ and $(R, Q)$ planes corresponding to the entanglement distillation capacity region and that
of noisy teleportation, respectively, defined by

$$g : (R, E) \mapsto (R + 2E, E).$$

By applying TP to Eq. (4.18), we find

$$g(C_{\mathrm{ED}}) \subseteq C_{\mathrm{NTP}}. \tag{4.21}$$

On the other hand, from the definitions of $\widetilde{C}_{\mathrm{ED}}$ and $\widetilde{C}_{\mathrm{NTP}}$ (Eqns. (4.19) and (4.13)), we have

$$\widetilde{C}_{\mathrm{ED}} = g^{-1}(\widetilde{C}_{\mathrm{NTP}}). \tag{4.22}$$

The converse for the noisy teleportation trade-off is written as $C_{\mathrm{NTP}} \subseteq \widetilde{C}_{\mathrm{NTP}}$. As $g$ is a bijection,
putting everything together we have

$$C_{\mathrm{ED}} \subseteq g^{-1}(C_{\mathrm{NTP}}) \subseteq g^{-1}(\widetilde{C}_{\mathrm{NTP}}) = \widetilde{C}_{\mathrm{ED}},$$

which is the converse for the entanglement distillation trade-off.

Regarding the direct coding theorem, it suffices to demonstrate the RI

$$\langle\rho^{AB}\rangle + \left(I(A'; EE'|X)_\sigma + I(X; BE)_\sigma\right) [c \to c] \geq I(A'\rangle BX)_\sigma\, [qq]. \tag{4.23}$$

Linearly combining the grandmother RI (Eq. (4.4)) with teleportation (Eq. (1.39)), much in the same way
the variation on the hashing RI (Eq. (4.1)) was obtained from the mother (Eq. (♀)), we have

$$\langle\rho^{AB}\rangle + \left(I(A'; EE'|X)_\sigma + I(X; BE)_\sigma\right) [c \to c] + o[qq] \geq I(A'\rangle BX)_\sigma\, [qq].$$

Eq. (4.23) follows by invoking Lemma 1.36 and Eq. (1.49). □

4.2.6 Trade-off for entanglement assisted quantum communication 

Given a noisy quantum channel $\mathcal{N} : A' \to B$, the entanglement assisted quantum communication
capacity region (or "father" capacity region for short) $C_{\mathrm{F}}(\mathcal{N})$ is the region of the $(E,Q)$ plane with
$E \geq 0$ and $Q \geq 0$ satisfying the RI

$$\langle\mathcal{N}\rangle + E\,[qq] \geq Q\,[q \to q]. \tag{4.24}$$
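Theorem 4.10. The capacity region $C_{\mathrm{F}}(\mathcal{N})$ is given by

$$C_{\mathrm{F}}(\mathcal{N}) = \widetilde{C}_{\mathrm{F}}(\mathcal{N}) := \bigcup_{l=1}^{\infty} \frac{1}{l}\, \widetilde{C}^{(1)}_{\mathrm{F}}(\mathcal{N}^{\otimes l}),$$

where $\widetilde{C}^{(1)}_{\mathrm{F}}(\mathcal{N})$ is the set of all $E \geq 0$, $Q \geq 0$ such that

$$Q \leq E + I(A\rangle B)_\sigma$$
$$Q \leq \tfrac{1}{2} I(A;B)_\sigma.$$

In the above, $\sigma$ is of the form

$$\sigma^{ABE} = U_{\mathcal{N}} \circ \mathcal{E}(\phi^{AA''}),$$

for some pure input state $|\phi^{AA''}\rangle$, encoding operation $\mathcal{E} : A'' \to A'$, and where $U_{\mathcal{N}} : A' \to BE$ is an
isometric extension of $\mathcal{N}$.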

Figure 4-3: A general protocol for entanglement assisted quantum communication.

This tradeoff region includes two well-known limit points. When $E = 0$, the quantum capacity of
$\mathcal{N}$ is $I(A\rangle B)$ [Llo96, Sho02, Dev05a], and for $E > 0$, entanglement distribution ($[q \to q] \geq [qq]$) means
it should still be bounded by $I(A\rangle B) + E$. On the other hand, when given unlimited entanglement, the
classical capacity is $I(A;B)$ [BSST02] and thus the quantum capacity is never greater than $\tfrac{1}{2} I(A;B)$
no matter how much entanglement is available. These bounds meet when $E = \tfrac{1}{2} I(A;E)$ and $Q =
\tfrac{1}{2} I(A;B)$, the point corresponding to the father protocol. Thus, the goal of our proof is to show that
the father protocol is optimal.
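
To illustrate why the two bounds of Theorem 4.10 meet at the father point, the sketch below evaluates both bounds on a random pure state of $ABE$ and checks that $E = \tfrac{1}{2} I(A;E)$ makes $E + I(A\rangle B)$ equal to $\tfrac{1}{2} I(A;B)$. It assumes NumPy; the subsystem dimensions and the helper names (`marginal_entropy`, `in_one_shot_father_region`) are hypothetical choices made only for this example.

```python
import numpy as np

def marginal_entropy(psi, dims, keep):
    """Entropy (bits) of the reduced state of the pure state psi on subsystems `keep`."""
    keep = sorted(keep)
    rest = [k for k in range(len(dims)) if k not in keep]
    t = psi.reshape(dims).transpose(keep + rest)
    m = t.reshape(int(np.prod([dims[k] for k in keep])), -1)
    p = np.linalg.svd(m, compute_uv=False) ** 2   # Schmidt coefficients
    p = p[p > 1e-12]
    return float(-(p * np.log2(p)).sum())

A, B, E = 0, 1, 2
dims = [2, 3, 4]                       # assumed sizes of A, B, E
rng = np.random.default_rng(1)
psi = rng.normal(size=24) + 1j * rng.normal(size=24)
psi /= np.linalg.norm(psi)
H = lambda *s: marginal_entropy(psi, dims, list(s))

coh_AB = H(B) - H(A, B)                # I(A>B)
mut_AB = H(A) + H(B) - H(A, B)         # I(A;B)
mut_AE = H(A) + H(E) - H(A, E)         # I(A;E)

def in_one_shot_father_region(E_rate, Q_rate, tol=1e-9):
    """Both inequalities of Theorem 4.10 for this fixed state."""
    return Q_rate <= E_rate + coh_AB + tol and Q_rate <= mut_AB / 2 + tol

E_star, Q_star = mut_AE / 2, mut_AB / 2   # the father point
print(E_star + coh_AB - Q_star)
```

The check relies only on the purity of the state on $ABE$, which gives $I(A\rangle B) = \tfrac{1}{2}\left(I(A;B) - I(A;E)\right)$.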



Proof. We first prove the converse. Fix $n, E, Q, \delta, \epsilon$, and use the Flattening Lemma to reduce the
depth to one. The resources available are

• The channel $\mathcal{N}^{\otimes n} : A'^n \to B^n$ from Alice to Bob. We shall shorten $A'^n$ to $A'$ and $B^n$ to $B$.

• The maximally entangled state $\Phi^{T_A T_B}$, $\dim T_A = \dim T_B = 2^{nE}$, shared between Alice and
Bob.

The resource to be simulated is the perfect quantum channel $\mathrm{id}_D : A_1 \to B_1$, $D = \dim A_1 = 2^{n(Q-\delta)}$,
from Alice to Bob, on any source, in particular on the maximally entangled state $\Phi^{RA_1}$.

In the protocol (see Fig. 4-3), Alice performs a general encoding map $\mathcal{E} : A_1 T_A \to A'E'$ and
sends the system $A'$ through the noisy channel $\mathcal{N} : A' \to B$. After time $t$ Bob performs a decoding
operation $\mathcal{D} : BT_B \to B_1$. The protocol ends at time $t_f$. Unless otherwise stated, the entropic
quantities below refer to the time $t$.

Define $A := RT_B$ and $A'' := A_1 T_A$. Since at time $t_f$ the state of the system $RB_1$ is supposed to
be $\epsilon$-close to the maximally entangled state $\Phi^{RB_1}$, Lemma 1.2 implies

$$I(R\rangle B_1)_{t_f} \geq n(Q - \delta) - \eta'(\epsilon) - K\epsilon nQ.$$

By the data processing inequality,

$$I(R\rangle B_1)_{t_f} \leq I(R\rangle BT_B).$$

Together with the inequality

$$I(R\rangle BT_B) \leq I(RT_B\rangle B) + H(T_B),$$

since $nE = H(T_B)$, the above implies

$$Q \leq E + \frac{1}{n} I(A\rangle B) + \delta + KQ\epsilon + \frac{\eta'(\epsilon)}{n}.$$

Combining this with

$$H(A) = H(R) + H(T_B) = nQ + nE$$

gives

$$Q \leq \frac{1}{2n} I(A;B) + \frac{\delta}{2} + \frac{KQ\epsilon}{2} + \frac{\eta'(\epsilon)}{2n}.$$

As these are true for any $\epsilon, \delta > 0$ and sufficiently large $n$, the converse holds.
Regarding the direct coding theorem, it follows directly from the father RI

$$\langle\mathcal{N}\rangle + \tfrac{1}{2} I(A;E)_\sigma\, [qq] \geq \tfrac{1}{2} I(A;B)_\sigma\, [q \to q].$$

□



4.2.7 Trade-off for entanglement assisted classical communication 

The result of this subsection was first proved by Shor in [Sho04b]. Here we state it for completeness,
and give an independent proof of the converse. An alternative proof of the direct coding theorem was
sketched in [DS03] and is pursued in [DHLS05] to unify this result with the father trade-off.

Given a noisy quantum channel $\mathcal{N} : A' \to B$, the entanglement assisted classical communication
capacity region (or "entanglement assisted" capacity region for short) $C_{\mathrm{EA}}(\mathcal{N})$ is the set of all points
$(E, R)$ with $E \geq 0$ and $R \geq 0$ satisfying the RI

$$\langle\mathcal{N}\rangle + E\,[qq] \geq R\,[c \to c]. \tag{4.25}$$

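Theorem 4.11. The capacity region $C_{\mathrm{EA}}(\mathcal{N})$ is given by

$$C_{\mathrm{EA}}(\mathcal{N}) = \widetilde{C}_{\mathrm{EA}}(\mathcal{N}) := \bigcup_{l=1}^{\infty} \frac{1}{l}\, \widetilde{C}^{(1)}_{\mathrm{EA}}(\mathcal{N}^{\otimes l}),$$

where $\widetilde{C}^{(1)}_{\mathrm{EA}}(\mathcal{N})$ is the set of all $E \geq 0$, $R \geq 0$ such that

$$R \leq \max_{\sigma} \left\{ I(AX; B)_\sigma : E \geq H(A|X)_\sigma \right\}. \tag{4.26}$$

In the above, $\sigma$ is of the form

$$\sigma^{XAB} = \sum_x p_x |x\rangle\langle x|^X \otimes \mathcal{N}(\phi_x^{AA'}), \tag{4.27}$$

for some pure input ensemble $(p_x, |\phi_x\rangle^{AA'})_x$.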

Proof. We first prove the converse. Fix $n, E, R, \delta, \epsilon$, and again use the Flattening Lemma to reduce
depth to one. The resources available are

• The channel $\mathcal{N}^{\otimes n} : A'^n \to B^n$ from Alice to Bob. We shall shorten $A'^n$ to $A'$ and $B^n$ to $B$.

• The maximally entangled state $\Phi^{T_A T_B}$, $\dim T_A = \dim T_B = 2^{nE}$, shared between Alice and
Bob.

The resource to be simulated is the perfect classical channel of size $D = 2^{n(R-\delta)}$ on any source, in
particular on the random variable $X$ corresponding to the uniform distribution $\pi_D$.
Figure 4-4: A general protocol for entanglement assisted classical communication.

In the protocol (see Fig. 4-4), Alice performs a $\{cq \to q\}$ encoding $(\mathcal{E}_x : T_A \to A')_x$, depending on
the source random variable, and then sends the $T_A$ system through the noisy channel $\mathcal{N} : A' \to BE$.
After time $t$ Bob performs a POVM $\Lambda : T_B B \to X'$ on the system $T_B B$, yielding the random variable
$X'$. The protocol ends at time $t_f$. Unless otherwise stated, the entropic quantities below refer to the
state of the system at time $t$.

Since at time $t_f$ the state of the system $XX'$ is supposed to be $\epsilon$-close to the perfectly correlated state, Lemma 1.2 implies

$$I(X; X')_{t_f} \geq n(R - \delta) - \eta'(\epsilon) - K\epsilon nR.$$

By the Holevo bound

$$I(X; X')_{t_f} \leq I(X; T_B B).$$

Using the chain rule twice, we find

$$I(X; T_B B) = I(X; B|T_B) + I(X; T_B)$$
$$\phantom{I(X; T_B B)} = I(XT_B; B) + I(X; T_B) - I(T_B; B).$$

Since $I(T_B; B) \geq 0$ and in this protocol $I(X; T_B) = 0$, this becomes

$$I(X; T_B B) \leq I(XT_B; B).$$

These all add up to

$$R \leq \frac{1}{n} I(XT_B; B) + \delta + KR\epsilon + \frac{\eta'(\epsilon)}{n},$$

while on the other hand,

$$nE \geq H(T_B|X).$$



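As these are true for any $\epsilon, \delta > 0$ and sufficiently large $n$, we have thus shown a variation on the
converse with the state $\sigma$ from (4.27) replaced by $\tilde{\sigma}$,

$$\tilde{\sigma}^{XABE'} = \sum_x p_x |x\rangle\langle x|^X \otimes \mathcal{N} \circ U_x^{A'' \to A'E'}(\phi_x^{AA''}),$$

defining $A := T_B$ and letting $U_x : T_A \to A'E'$ be the isometric extension of $\mathcal{E}_x$.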

However, this is a weaker result than we would like; the converse we have proved allows arbitrary
noisy encodings and we would like to show that isometric encodings are optimal, or equivalently that
the $E'$ register is unnecessary. We will accomplish this, following Shor [Sho04a], by using a standard
trick of measuring $E'$ and showing that the protocol can only improve. If we apply the dephasing
map $\mathrm{id} : E' \to Y$ to $\tilde{\sigma}^{XABE'}$, we obtain a state of the form

$$\sigma^{XYAB} = \sum_{x,y} p_{xy} |x\rangle\langle x|^X \otimes |y\rangle\langle y|^Y \otimes \mathcal{N}(\phi_{xy}^{AA'}).$$

The converse now follows from

$$I(B; AX)_{\tilde{\sigma}} \leq I(B; AXY)_\sigma$$
$$H(A|X)_{\tilde{\sigma}} \geq H(A|XY)_\sigma.$$

□



4.3 Conclusion 

The goal of quantum Shannon theory is to give information-theoretic formulae for the rates at which
noisy quantum resources can be converted into noiseless ones. This chapter has taken a major
step towards that goal by finding the trade-off curves for most one-way communication scenarios
involving a noisy state or channel and two of the three basic noiseless resources (cbits, ebits and
qubits). The main tools required for this were the resource formalism of Chapter 1, coherent classical
communication (from Chapter 3), derandomization and basic protocols like HSW coding.

However, our expressions for trade-off curves should also be seen as first steps rather than
final answers. For one thing, we would ultimately like to have formulae for the capacity that can be
efficiently computed, which will probably require replacing our current regularized expressions with
single-letter ones. This is related to the additivity conjectures, which are equivalent for some channel
capacities [Sho03], but are false for others [DSS98].

A more reasonable first goal is to strengthen some of the converse theorems, so that they do not
require maximizing over as many different quantum operations. As inspiration, note that [BKN00]
showed that isometric encodings suffice to achieve the optimal rate of quantum communication
through a quantum channel. However, the analogous result for entanglement-assisted quantum
communication is not known. Specifically, in Fig. 4-3, I suspect that the $E'$ register (used to discard some
of the inputs) is only necessary when Alice and Bob share more entanglement than the protocol can
use. Similarly, it seems plausible to assume that the optimal form of protocols for noisy teleportation
(Fig. 4-2) is to perform a general TPCP preprocessing operation on the shared entanglement, followed
by a unitary interaction between the quantum data and Alice's part of the entangled state. These
are only two of the more obvious examples and there ought to be many possible ways of improving
our formulae.



Chapter 5 

The Schur transform 



5.1 Overview 



The final four chapters will explore the uses of Schur duality in quantum computing and information
theory. Schur duality is a natural way to decompose $(\mathbb{C}^d)^{\otimes n}$ in terms of representations of the
symmetric group $\mathcal{S}_n$ and the unitary group $\mathcal{U}_d$. In this chapter, we will describe Schur duality
and develop its representation-theoretic background within the framework of quantum information.
The primary connection between these fields is that a vector space can be interpreted either as
a representation of a group or as the state space of a quantum system. Thus, Schur duality can be
interpreted both as a mathematical fact about representations and operationally as a fact about the
transformations possible on a quantum system.

Chapter 6 will describe how Schur duality is useful in quantum information theory. We will
see that Schur duality is a quantum analogue of the classical method of types, in which strings are
described in terms of their empirical distributions. This has a number of applications in information
theory, which we will survey while highlighting the role of Schur duality. The chapter concludes with
new work describing how i.i.d. quantum channels can be decomposed in the Schur basis.

We then turn to computational issues in Chapters 7 and 8. The unitary transform that relates the
Schur basis to the computational basis is known as the Schur transform and presenting efficient circuits
for the Schur transform is the main goal of Chapter 7. These circuits mean that the
information-theoretic tasks described in Chapter 6 can now all be implemented efficiently on a quantum computer;
even though computational efficiency is not often considered in quantum information theory, it will
be necessary if we ever expect to implement many of the coding schemes that exist.

Finally, Chapter 8 discusses algorithmic connections between the Schur transform and related
efficient representation-theoretic transforms, such as the quantum Fourier transform on $\mathcal{S}_n$. Ultimately
the goal of this work is to find quantum speedups that use either the Schur transform or the $\mathcal{S}_n$
Fourier transform.

Most of the original work in this chapter has not yet been published. The next two chapters are
mostly review, although there are several places where the material is assembled and presented in
ways that have not been seen before in the literature. The exception is the last section of Chapter 6 on
decomposing i.i.d. quantum channels, which is a new contribution. The last two chapters are joint
work with Dave Bacon and Isaac Chuang. Parts of Chapter 7 appeared in [BCH04] and the rest of
the chapter will be presented in [BCH06a]. Chapter 8 will become [BCH06b].






5.2. REPRESENTATION THEORY AND QUANTUM COMPUTING 



5.2 Representation theory and quantum computing 
5.2.1 Basics of representation theory

In this section, we review aspects of representation theory that will be used in the second half of the 
thesis. For a more detailed description of representation theory, the reader should consult [Art95] 
for general facts about group theory and representation theory or [GW98] for representations of Lie 
groups. See also [FH91] for a more introductory and informal approach to Lie groups and their 
representations. 

Representations: For a complex vector space $V$, define $\mathrm{End}(V)$ to be the set of linear maps from
$V$ to itself (endomorphisms). A representation of a group $G$ is a vector space $V$ together with a
homomorphism from $G$ to $\mathrm{End}(V)$, i.e. a function $\mathbf{R} : G \to \mathrm{End}(V)$ such that $\mathbf{R}(g_1)\mathbf{R}(g_2) = \mathbf{R}(g_1 g_2)$.
If $\mathbf{R}(g)$ is a unitary operator for all $g$, then we say $\mathbf{R}$ is a unitary representation. Furthermore, we say
a representation $(\mathbf{R}, V)$ is finite dimensional if $V$ is a finite dimensional vector space. In this thesis,
we will always consider complex, finite dimensional, unitary representations and use the generic term
'representation' to refer to complex, finite dimensional, unitary representations. Also, when clear
from the context, we will denote a representation $(\mathbf{R}, V)$ simply by the representation space $V$.

The reason we consider only complex, finite dimensional, unitary representations is so that we
can use them in quantum computing. If $d = \dim V$, then a $d$-dimensional quantum system can hold
a unit vector in a representation $V$. A group element $g \in G$ corresponds to a unitary rotation $\mathbf{R}(g)$,
which can in principle be performed by a quantum computer.

Homomorphisms: For any two vector spaces $V_1$ and $V_2$, define $\mathrm{Hom}(V_1, V_2)$ to be the set of linear
transformations from $V_1$ to $V_2$. If $G$ acts on $V_1$ and $V_2$ with representation matrices $\mathbf{R}_1$ and $\mathbf{R}_2$ then
the canonical action of $G$ on $\mathrm{Hom}(V_1, V_2)$ is given by the map from $M$ to $\mathbf{R}_2(g) M \mathbf{R}_1(g)^{-1}$ for any
$M \in \mathrm{Hom}(V_1, V_2)$. For any representation $(\mathbf{R}, V)$ define $V^G$ to be the space of $G$-invariant vectors of
$V$: i.e. $V^G := \{|v\rangle \in V : \mathbf{R}(g)|v\rangle = |v\rangle \;\forall g \in G\}$. Of particular interest is the space $\mathrm{Hom}(V_1, V_2)^G$,
which can be thought of as the linear maps from $V_1$ to $V_2$ which commute with the action of $G$.
If $\mathrm{Hom}(V_1, V_2)^G$ contains any invertible maps (or equivalently, any unitary maps) then we say that
$(\mathbf{R}_1, V_1)$ and $(\mathbf{R}_2, V_2)$ are equivalent representations and write

$$V_1 \stackrel{G}{\cong} V_2.$$

This means that there exists a unitary change of basis $U : V_1 \to V_2$ such that for any $g \in G$,
$U \mathbf{R}_1(g) U^\dagger = \mathbf{R}_2(g)$.

Dual representations: Recall that the dual of a vector space $V$ is the set of linear maps from $V$
to $\mathbb{C}$ and is denoted $V^*$. Usually if vectors in $V$ are denoted by kets (e.g. $|v\rangle$) then vectors in $V^*$
are denoted by bras (e.g. $\langle v|$). If we fix a basis $\{|v_1\rangle, |v_2\rangle, \ldots\}$ for $V$ then the transpose is a linear
map from $V$ to $V^*$ given by $|v_i\rangle \to \langle v_i|$. Now, for a representation $(\mathbf{R}, V)$ we can define the dual
representation $(\mathbf{R}^*, V^*)$ by $\mathbf{R}^*(g)\langle v^*| := \langle v^*|\mathbf{R}(g^{-1})$. If we think of $\mathbf{R}^*$ as a representation on $V$
(using the transpose map to relate $V$ and $V^*$), then it is given by $\mathbf{R}^*(g) = (\mathbf{R}(g^{-1}))^T$. When $\mathbf{R}$ is a
unitary representation, this is the same as the conjugate representation $\mathbf{R}(g)^*$, where here $*$ denotes
the entrywise complex conjugate. One can readily verify that the dual and conjugate representations
are indeed representations and that $\mathrm{Hom}(V_1, V_2) \stackrel{G}{\cong} V_1^* \otimes V_2$.

Irreducible representations: Generically the unitary operators of a representation may be specified
(and manipulated on a quantum computer) in an arbitrary orthonormal basis. The added structure of
being a representation, however, implies that there are particular bases which are more fundamental
to expressing the action of the group. We say a representation $(\mathbf{R}, V)$ is irreducible (and call it
an irreducible representation, or irrep) if the only subspaces of $V$ which are invariant under $\mathbf{R}$ are
the zero subspace $\{0\}$ and the entire space $V$. For finite groups, any finite-dimensional complex
representation is reducible; meaning it is decomposable into a direct sum of irreps. For Lie groups, we
need additional conditions, such as demanding that the representation $\mathbf{R}(g)$ be rational; i.e. its matrix
elements are polynomial functions of the matrix elements $g_{ij}$ and $(\det g)^{-1}$. We say a representation
of a Lie group is polynomial if its matrix elements are polynomial functions only of the $g_{ij}$.

Isotypic decomposition: Let $\hat{G}$ be a complete set of inequivalent irreps of $G$. Then for any reducible
representation $(\mathbf{R}, V)$ there is a basis under which the action of $\mathbf{R}(g)$ can be expressed as

$$\mathbf{R}(g) \cong \bigoplus_{\lambda \in \hat{G}} \bigoplus_{j=1}^{n_\lambda} \mathbf{r}_\lambda(g) = \bigoplus_{\lambda \in \hat{G}} \mathbf{r}_\lambda(g) \otimes I_{n_\lambda} \tag{5.1}$$

where $\lambda \in \hat{G}$ labels an irrep $(\mathbf{r}_\lambda, V_\lambda)$ and $n_\lambda$ is the multiplicity of the irrep $\lambda$ in the representation $V$.
Here we use $\cong$ to indicate that there exists a unitary change of basis relating the left-hand side to the
right-hand side.* Under this change of basis we obtain a similar decomposition of the representation
space $V$ (known as the isotypic decomposition):

$$V \stackrel{G}{\cong} \bigoplus_{\lambda \in \hat{G}} V_\lambda \otimes \mathbb{C}^{n_\lambda}. \tag{5.2}$$

Thus while generically we may be given a representation in some arbitrary basis, the structure of
being a representation picks out a particular basis under which the action of the representation is not
just block diagonal but also maximally block diagonal: a direct sum of irreps.

Moreover, the multiplicity space $\mathbb{C}^{n_\lambda}$ in Eq. (5.2) has the structure of $\mathrm{Hom}(V_\lambda, V)^G$. This means
that for any representation $(\mathbf{R}, V)$, Eq. (5.2) can be restated as

$$V \stackrel{G}{\cong} \bigoplus_{\lambda \in \hat{G}} V_\lambda \otimes \mathrm{Hom}(V_\lambda, V)^G. \tag{5.3}$$

Since $G$ acts trivially on $\mathrm{Hom}(V_\lambda, V)^G$, Eq. (5.1) remains the same. As with the other results in
this chapter, a proof of Eq. (5.3) can be found in [GW98], or other standard texts on representation
theory.

The value of Eq. (5.3) is that the unitary mapping from the right-hand side (RHS) to the left-hand
side (LHS) has a simple explicit expression: it corresponds to the canonical map $\varphi : A \otimes \mathrm{Hom}(A, B) \to
B$ given by $\varphi(a \otimes f) = f(a)$. Of course, this doesn't tell us how to describe $\mathrm{Hom}(V_\lambda, V)^G$, or how
to specify an orthonormal basis for the space, but we will later find this form of the decomposition
useful.



5.2.2 The Clebsch-Gordan transform 

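If $(\mathbf{R}_\mu, V_\mu)$ and $(\mathbf{R}_\nu, V_\nu)$ are representations of $G$, their tensor product $(\mathbf{R}_\mu \otimes \mathbf{R}_\nu, V_\mu \otimes V_\nu)$ is another
representation of $G$. In general if $V_\mu$ and $V_\nu$ are irreducible, their tensor product will not necessarily
be. According to Eq. (5.3), the tensor product decomposes as

$$V_\mu \otimes V_\nu \stackrel{G}{\cong} \bigoplus_{\lambda \in \hat{G}} V_\lambda \otimes \mathrm{Hom}(V_\lambda, V_\mu \otimes V_\nu)^G \stackrel{G}{\cong} \bigoplus_{\lambda \in \hat{G}} V_\lambda \otimes \mathbb{C}^{M^\lambda_{\mu\nu}}, \tag{5.4}$$

where we have defined the multiplicity $M^\lambda_{\mu\nu} := \dim \mathrm{Hom}(V_\lambda, V_\mu \otimes V_\nu)^G$. When $G = \mathcal{U}_d$, the $M^\lambda_{\mu\nu}$ are
known as Littlewood-Richardson coefficients.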

The decomposition in Eq. (5.4) is known as the Clebsch-Gordan (CG) decomposition and the
corresponding unitary map $U_{\mathrm{CG}}$ is called the CG transform. On a quantum computer, we can think
of $U_{\mathrm{CG}}$ as a map from states of the form $|v_\mu\rangle|v_\nu\rangle$ to superpositions of states $|\lambda\rangle|v_\lambda\rangle|\alpha\rangle$, where $\lambda \in \hat{G}$
labels an irrep, $|v_\lambda\rangle$ is a basis state for $V_\lambda$ and $\alpha \in \mathrm{Hom}(V_\lambda, V_\mu \otimes V_\nu)^G$. Using the isomorphism
$\mathrm{Hom}(A, B) \cong A^* \otimes B$ we could also write that $|\alpha\rangle \in (V_\lambda^* \otimes V_\mu \otimes V_\nu)^G$; an interpretation which makes
it more obvious how to normalize $\alpha$.

*We only need to use $\stackrel{G}{\cong}$ when relating representation spaces. In Eq. (5.1) and other similar isomorphisms, we instead
explicitly specify the dependence of both sides on $g \in G$.

There are a few issues that arise when implementing the map $U_{\mathrm{CG}}$. For example, since different
$V_\lambda$ (and different multiplicity spaces) have different dimensions, the register for $|v_\lambda\rangle$ will need to be
padded to at least $\lceil \log \max_\lambda \dim V_\lambda \rceil$ qubits. This means that the overall transformation will be an
isometry that slightly enlarges the Hilbert space, or equivalently, will be a unitary that requires the
input of a small number of ancilla qubits initialized to $|0\rangle$. Also, when $G$ has an infinite number of
inequivalent irreps (e.g. when $G$ is a Lie group) then in order to store $\lambda$, we need to consider only
some finite subset of $\hat{G}$. Fortunately, there is usually a natural way to perform this restriction.

Returning to Eq. (5.4) for a moment, note that all the complexity of the CG transform is pushed
into the multiplicity space $\mathrm{Hom}(V_\lambda, V_\mu \otimes V_\nu)^G$. For example, the fact that some values of $\lambda$ don't
appear on the RHS means that some of the multiplicity spaces may be zero. Also, the inverse
transform $U_{\mathrm{CG}}^\dagger$ is given simply by the map

$$U_{\mathrm{CG}}^\dagger |\lambda\rangle|v_\lambda\rangle|\alpha\rangle = \alpha|v_\lambda\rangle. \tag{5.5}$$

We will use these properties of the CG transform when decomposing i.i.d. channels in Section 6.4
and in giving an efficient construction of the CG transform in Section 7.3.



5.2.3 The quantum Fourier transform 

Let $G$ be a finite group (we will return to Lie groups later). A useful representation is given by
letting each $g \in G$ define an orthonormal basis vector $|g\rangle$. The resulting space $\mathrm{Span}\{|g\rangle : g \in G\}$ is
denoted $\mathbb{C}[G]$ and is called the regular representation. $G$ can act on $\mathbb{C}[G]$ in two different ways: left
multiplication $\mathbf{L}(g)|h\rangle := |gh\rangle$, and right multiplication $\mathbf{R}(g)|h\rangle := |hg^{-1}\rangle$. This means that there
are really two different regular representations: the left regular representation $(\mathbf{L}, \mathbb{C}[G])$ and the right
regular representation $(\mathbf{R}, \mathbb{C}[G])$. Since these representations commute, we could think of $\mathbf{L}(g_1)\mathbf{R}(g_2)$
as a representation of $G \times G$. Under this action, it can be shown that $\mathbb{C}[G]$ decomposes as

$$\mathbb{C}[G] \stackrel{G \times G}{\cong} \bigoplus_{\lambda \in \hat{G}} V_\lambda \hat{\otimes} V_\lambda^*. \tag{5.6}$$

Here the $V_\lambda$ correspond to $\mathbf{L}$ and $V_\lambda^*$ corresponds to $\mathbf{R}$, and $\hat{\otimes}$ is used to emphasize that we are not
considering the tensor product action of a single group, but rather are taking the tensor product of
two irreps from two different copies of the group $G$. This means that if we decompose only one of
the regular representations, e.g. $(\mathbf{L}, \mathbb{C}[G])$, the $V_\lambda^*$ in Eq. (5.6) becomes the multiplicity space for $V_\lambda$
as follows:

$$\mathbb{C}[G] \stackrel{G}{\cong} \bigoplus_{\lambda \in \hat{G}} V_\lambda \otimes \mathbb{C}^{\dim V_\lambda}. \tag{5.7}$$

A similar expression holds for $(\mathbf{R}, \mathbb{C}[G])$ with $V_\lambda^*$ appearing instead of $V_\lambda$.
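
As a concrete illustration of the two regular representations, the following sketch builds $\mathbf{L}(g)$ and $\mathbf{R}(g)$ as permutation matrices for $G = \mathcal{S}_3$ and checks that each is a homomorphism and that they commute. It assumes NumPy; the encoding of permutations as tuples and the helper names (`compose`, `inverse`, `left`, `right`) are hypothetical choices for this example.

```python
import itertools
import numpy as np

# Elements of S_3 as tuples: s[i] is the image of i under s.
G = list(itertools.permutations(range(3)))
index = {g: k for k, g in enumerate(G)}

def compose(s, t):
    """Group product st, meaning 'apply t first, then s'."""
    return tuple(s[t[i]] for i in range(len(t)))

def inverse(s):
    inv = [0] * len(s)
    for i, si in enumerate(s):
        inv[si] = i
    return tuple(inv)

def left(g):
    """Matrix of L(g)|h> = |gh> on C[G]."""
    M = np.zeros((len(G), len(G)))
    for h in G:
        M[index[compose(g, h)], index[h]] = 1
    return M

def right(g):
    """Matrix of R(g)|h> = |h g^{-1}> on C[G]."""
    M = np.zeros((len(G), len(G)))
    for h in G:
        M[index[compose(h, inverse(g))], index[h]] = 1
    return M

homomorphism_ok = all(
    np.array_equal(left(a) @ left(b), left(compose(a, b)))
    and np.array_equal(right(a) @ right(b), right(compose(a, b)))
    for a in G for b in G
)
commute_ok = all(
    np.array_equal(left(a) @ right(b), right(b) @ left(a))
    for a in G for b in G
)
print(homomorphism_ok, commute_ok)
```

The inverse in the definition of $\mathbf{R}$ is what makes right multiplication a homomorphism rather than an anti-homomorphism; dropping it would make the first check fail for nonabelian $G$.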

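The unitary matrix corresponding to the isomorphism in Eq. (5.6) is called the Fourier transform,
or when it acts on quantum registers, the quantum Fourier transform (QFT). Denote this matrix by
$U_{\mathrm{QFT}}$. For any $g_1, g_2 \in G$ we have

$$\hat{\mathbf{L}}(g_1)\hat{\mathbf{R}}(g_2) := U_{\mathrm{QFT}} \mathbf{L}(g_1)\mathbf{R}(g_2) U_{\mathrm{QFT}}^\dagger = \sum_{\lambda \in \hat{G}} |\lambda\rangle\langle\lambda| \otimes \mathbf{r}_\lambda(g_1) \otimes \mathbf{r}_\lambda(g_2)^*, \tag{5.8}$$

where $\hat{\mathbf{L}}$ and $\hat{\mathbf{R}}$ are the Fourier transformed versions of $\mathbf{L}$ and $\mathbf{R}$; $\hat{\mathbf{L}}(g) := U_{\mathrm{QFT}} \mathbf{L}(g) U_{\mathrm{QFT}}^\dagger$ and
$\hat{\mathbf{R}}(g) := U_{\mathrm{QFT}} \mathbf{R}(g) U_{\mathrm{QFT}}^\dagger$.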
Unlike the CG transform, the Fourier transform has a simple explicit expression.

$$U_{\mathrm{QFT}} = \sum_{g \in G} \sum_{\lambda \in \hat{G}} \sum_{i,j=1}^{\dim V_\lambda} \sqrt{\frac{\dim V_\lambda}{|G|}}\; \mathbf{r}_\lambda(g)_{ij}\, |\lambda, i, j\rangle\langle g| \tag{5.9}$$

The best-known quantum Fourier transform is over the cyclic group $G = \mathbb{Z}_N$. Here the form is
particularly simple, since all irreps are one-dimensional and the set of irreps $\hat{G}$ is equivalent to $\mathbb{Z}_N$. Thus the
$|i,j\rangle$ register can be neglected and we obtain the familiar expression $\sum_{x,y \in \mathbb{Z}_N} N^{-1/2} e^{2\pi i xy/N} |y\rangle\langle x|$.
The ability of a quantum computer to efficiently implement this Fourier transform is at the heart of
quantum computing's most famous advantages over classical computation [Sho94].
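
For the cyclic group the statement that the Fourier transform decomposes the regular representation into one-dimensional irreps can be checked directly. The sketch below, assuming NumPy and using the hypothetical helpers `qft_matrix` and `shift_matrix`, builds the $\mathbb{Z}_N$ Fourier matrix as a classical array (it is not a quantum circuit) and verifies that it diagonalizes the shift $|x\rangle \to |x+1 \bmod N\rangle$.

```python
import numpy as np

def qft_matrix(N):
    """Matrix with entries N^{-1/2} exp(2 pi i x y / N) in row y, column x."""
    x = np.arange(N)
    return np.exp(2j * np.pi * np.outer(x, x) / N) / np.sqrt(N)

def shift_matrix(N):
    """Regular representation of the generator of Z_N: |x> -> |x+1 mod N>."""
    return np.roll(np.eye(N), 1, axis=0)

N = 8
F = qft_matrix(N)
S = shift_matrix(N)
D = F @ S @ F.conj().T

is_unitary = np.allclose(F.conj().T @ F, np.eye(N))
# In the Fourier basis the shift acts as the phase exp(2 pi i y / N) on |y>.
is_diagonal = np.allclose(D, np.diag(np.exp(2j * np.pi * np.arange(N) / N)))
print(is_unitary, is_diagonal)
```

Each diagonal entry is the value of a one-dimensional irrep of $\mathbb{Z}_N$ on the generator, which is the $\dim V_\lambda = 1$ special case of Eq. (5.8).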

Quantum Fourier transforms can also be efficiently implemented for many other groups.
Beals [Bea97] has shown how to implement the $\mathcal{S}_n$ QFT on a quantum computer in poly($n$) time,
Püschel, Rötteler and Beth [PRB99] have given efficient QFTs for other nonabelian groups and Moore,
Rockmore and Russell [MRR04] have generalized these approaches to many other finite groups. Fourier
transforms on Lie groups are also possible, though the infinite-dimensional spaces involved lead to
additional complications that we will not discuss here. Later (Section 7.3) we will give an efficient
algorithm for a $\mathcal{U}_d$ CG transform. However, if some sort of $\mathcal{U}_d$ QFT could be efficiently constructed
on a quantum computer, then it would yield an alternate algorithm for the $\mathcal{U}_d$ CG transform. We
will discuss this possibility further in Section 8.1.3 (see also Prop 9.1 of [Kup03]) and will discuss the
$\mathcal{S}_n$ QFT more broadly in Chapter 8.



5.3 Schur duality 

We now turn to the two representations relevant to the Schur transform. Recall that the symmetric
group of degree $n$, $\mathcal{S}_n$, is the group of all permutations of $n$ objects. Then we have the following
natural representation of the symmetric group on the space $(\mathbb{C}^d)^{\otimes n}$:

$$\mathbf{P}(s)|i_1\rangle \otimes |i_2\rangle \otimes \cdots \otimes |i_n\rangle = |i_{s^{-1}(1)}\rangle \otimes |i_{s^{-1}(2)}\rangle \otimes \cdots \otimes |i_{s^{-1}(n)}\rangle \tag{5.10}$$

where $s \in \mathcal{S}_n$ is a permutation and $s(i)$ is the label describing the action of $s$ on label $i$. For example,
consider the transposition $s = (12)$ belonging to the group $\mathcal{S}_3$. Then $\mathbf{P}(s)|i_1, i_2, i_3\rangle = |i_2, i_1, i_3\rangle$.
$(\mathbf{P}, (\mathbb{C}^d)^{\otimes n})$ is the representation of the symmetric group which will be relevant to the Schur transform.
Note that $\mathbf{P}$ obviously depends on $n$, but also has an implicit dependence on $d$.

Now we turn to the representation of the unitary group. Let $\mathcal{U}_d$ denote the group of $d \times d$ unitary
operators. Then there is a representation of $\mathcal{U}_d$ given by the $n$-fold product action as

$$\mathbf{Q}(U)|i_1\rangle \otimes |i_2\rangle \otimes \cdots \otimes |i_n\rangle = U|i_1\rangle \otimes U|i_2\rangle \otimes \cdots \otimes U|i_n\rangle \tag{5.11}$$

for any $U \in \mathcal{U}_d$. More compactly, we could write that $\mathbf{Q}(U) = U^{\otimes n}$. $(\mathbf{Q}, (\mathbb{C}^d)^{\otimes n})$ is the representation
of the unitary group which will be relevant to the Schur transform.
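
The two representations can be written out explicitly for small $n$ and $d$. The following sketch, assuming NumPy, constructs $\mathbf{P}(s)$ from Eq. (5.10) and $\mathbf{Q}(U) = U^{\otimes n}$ from Eq. (5.11) for $n = 3$, $d = 2$, and checks that $\mathbf{P}$ is a homomorphism and that $\mathbf{P}(s)$ commutes with $\mathbf{Q}(U)$. The function names (`perm_rep`, `tensor_power`, `random_unitary`, `compose`) and the zero-based tuple encoding of permutations are hypothetical conventions for this example.

```python
import functools
import itertools
import numpy as np

def perm_rep(s, d):
    """Matrix of P(s) on (C^d)^{(x) n}; s[k] is the image of position k (zero-based)."""
    n = len(s)
    M = np.zeros((d ** n, d ** n))
    for idx in itertools.product(range(d), repeat=n):
        # Eq. (5.10): the tensor factor in position k moves to position s(k).
        out = [0] * n
        for k in range(n):
            out[s[k]] = idx[k]
        col = np.ravel_multi_index(idx, (d,) * n)
        row = np.ravel_multi_index(tuple(out), (d,) * n)
        M[row, col] = 1
    return M

def tensor_power(U, n):
    """Q(U) = U^{(x) n}, Eq. (5.11)."""
    return functools.reduce(np.kron, [U] * n)

def random_unitary(d, rng):
    """A unitary from the QR decomposition of a Gaussian matrix (illustrative sampling)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    q, _ = np.linalg.qr(g)
    return q

def compose(s, t):
    """Group product st: apply t first, then s."""
    return tuple(s[t[k]] for k in range(len(t)))

n, d = 3, 2
rng = np.random.default_rng(0)
U = random_unitary(d, rng)
S_n = list(itertools.permutations(range(n)))

is_homomorphism = all(
    np.array_equal(perm_rep(s, d) @ perm_rep(t, d), perm_rep(compose(s, t), d))
    for s in S_n for t in S_n
)
commutes = all(
    np.allclose(perm_rep(s, d) @ tensor_power(U, n), tensor_power(U, n) @ perm_rep(s, d))
    for s in S_n
)
print(is_homomorphism, commutes)
```

With this encoding the transposition $(12)$ is the tuple `(1, 0, 2)`, and `perm_rep((1, 0, 2), 2)` sends $|0,1,0\rangle$ to $|1,0,0\rangle$, in agreement with the example in the text.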

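Since both $\mathbf{P}(s)$ and $\mathbf{Q}(U)$ meet our above criteria for reducibility, they can each be decomposed
into a direct sum of irreps as in Eq. (5.1),

$$\mathbf{P}(s) \stackrel{\mathcal{S}_n}{\cong} \bigoplus_\alpha I_{n_\alpha} \otimes \mathbf{p}_\alpha(s)$$
$$\mathbf{Q}(U) \stackrel{\mathcal{U}_d}{\cong} \bigoplus_\beta I_{m_\beta} \otimes \mathbf{q}_\beta(U) \tag{5.12}$$

where $n_\alpha$ ($m_\beta$) is the multiplicity of the $\alpha$th ($\beta$th) irrep $\mathbf{p}_\alpha(s)$ ($\mathbf{q}_\beta(U)$) in the representation $\mathbf{P}(s)$
($\mathbf{Q}(U)$). At this point there is not necessarily any relation between the two different unitary
transforms implementing the isomorphisms in Eq. (5.12). However, further structure in this decomposition
follows from the fact that $\mathbf{P}(s)$ commutes with $\mathbf{Q}(U)$: $\mathbf{P}(s)\mathbf{Q}(U) = \mathbf{Q}(U)\mathbf{P}(s)$. This implies, via
Schur's Lemma, that the irreps of $\mathbf{P}(s)$ must act on the multiplicity labels of the irreps of
$\mathbf{Q}(U)$ and vice versa. Thus, the simultaneous action of $\mathbf{P}$ and $\mathbf{Q}$ on $(\mathbb{C}^d)^{\otimes n}$ decomposes as

$$\mathbf{Q}(U)\mathbf{P}(s) \stackrel{\mathcal{U}_d \times \mathcal{S}_n}{\cong} \bigoplus_\alpha \bigoplus_\beta I_{m_{\alpha,\beta}} \otimes \mathbf{q}_\beta(U) \otimes \mathbf{p}_\alpha(s) \tag{5.13}$$

where $m_{\alpha,\beta}$ can be thought of as the multiplicity of the irrep $\mathbf{q}_\beta(U) \hat{\otimes} \mathbf{p}_\alpha(s)$ of the group $\mathcal{U}_d \times \mathcal{S}_n$.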

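Not only do $\mathbf{P}$ and $\mathbf{Q}$ commute, but the algebras they generate (i.e. $\mathcal{A} := \mathbf{P}(\mathbb{C}[\mathcal{S}_n]) = \mathrm{Span}\{\mathbf{P}(s) :
s \in \mathcal{S}_n\}$ and $\mathcal{B} := \mathbf{Q}(\mathbb{C}[\mathcal{U}_d]) = \mathrm{Span}\{\mathbf{Q}(U) : U \in \mathcal{U}_d\}$) centralize each other [GW98], meaning that $\mathcal{B}$
is the set of operators in $\mathrm{End}((\mathbb{C}^d)^{\otimes n})$ commuting with $\mathcal{A}$ and vice versa, $\mathcal{A}$ is the set of operators
in $\mathrm{End}((\mathbb{C}^d)^{\otimes n})$ commuting with $\mathcal{B}$. This means that the multiplicities $m_{\alpha,\beta}$ are either zero or one,
and that each $\alpha$ and $\beta$ appears at most once. Thus Eq. (5.13) can be further simplified to

$$\mathbf{Q}(U)\mathbf{P}(s) \stackrel{\mathcal{U}_d \times \mathcal{S}_n}{\cong} \bigoplus_\lambda \mathbf{q}_\lambda(U) \otimes \mathbf{p}_\lambda(s) \tag{5.14}$$

where $\lambda$ runs over some unspecified set.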

Finally, Schur duality (or Schur-Weyl duality) [GW98] provides a simple characterization of the
range of $\lambda$ in Eq. (5.14) and shows how the decompositions are related for different values of $n$ and
$d$. To define Schur duality, we will need to somehow specify the irreps of $\mathcal{S}_n$ and $\mathcal{U}_d$.

Let $\mathcal{I}_{d,n} = \{\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_d) \mid \lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_d \geq 0 \text{ and } \sum_{i=1}^d \lambda_i = n\}$ denote partitions of
$n$ into $\leq d$ parts. We consider two partitions $(\lambda_1, \ldots, \lambda_d)$ and $(\lambda_1, \ldots, \lambda_d, 0, \ldots, 0)$ equivalent if they
differ only by trailing zeroes; according to this principle, $\mathcal{I}_n := \mathcal{I}_{n,n}$ contains all the partitions of $n$.
Partitions label irreps of $\mathcal{S}_n$ and $\mathcal{U}_d$ as follows: if we let $d$ vary, then $\mathcal{I}_{d,n}$ labels irreps of $\mathcal{S}_n$, and if
we let $n$ vary, then $\mathcal{I}_{d,n}$ labels polynomial irreps of $\mathcal{U}_d$. Call these $(\mathbf{p}_\lambda, \mathcal{P}_\lambda)$ and $(\mathbf{q}_\lambda^d, \mathcal{Q}_\lambda^d)$ respectively,
for $\lambda \in \mathcal{I}_{d,n}$. We need the superscript $d$ because the same partition $\lambda$ can label different irreps for
different $\mathcal{U}_d$; on the other hand the $\mathcal{S}_n$-irrep $\mathcal{P}_\lambda$ is uniquely labeled by $\lambda$ since $n = |\lambda|$.

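For the case of $n$ qudits, Schur duality states that there exists a basis (which we label $|\lambda\rangle|q_\lambda\rangle|p_\lambda\rangle_{\mathrm{Sch}}$
and call the Schur basis) which simultaneously decomposes the action of $\mathbf{P}(s)$ and $\mathbf{Q}(U)$ into irreps:

$$\mathbf{Q}(U)|\lambda\rangle|q_\lambda\rangle|p_\lambda\rangle_{\mathrm{Sch}} = |\lambda\rangle(\mathbf{q}_\lambda^d(U)|q_\lambda\rangle)|p_\lambda\rangle_{\mathrm{Sch}}$$
$$\mathbf{P}(s)|\lambda\rangle|q_\lambda\rangle|p_\lambda\rangle_{\mathrm{Sch}} = |\lambda\rangle|q_\lambda\rangle(\mathbf{p}_\lambda(s)|p_\lambda\rangle)_{\mathrm{Sch}} \tag{5.15}$$

and that the common representation space $(\mathbb{C}^d)^{\otimes n}$ decomposes as

$$(\mathbb{C}^d)^{\otimes n} \stackrel{\mathcal{U}_d \times \mathcal{S}_n}{\cong} \bigoplus_{\lambda \in \mathcal{I}_{d,n}} \mathcal{Q}_\lambda^d \hat{\otimes} \mathcal{P}_\lambda. \tag{5.16}$$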
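
A quick consistency check on Eq. (5.16) is that the dimensions add up: $\sum_{\lambda \in \mathcal{I}_{d,n}} \dim \mathcal{Q}_\lambda^d \cdot \dim \mathcal{P}_\lambda = d^n$. The sketch below enumerates $\mathcal{I}_{d,n}$ and evaluates both dimensions. It relies on two standard results that are not stated in this chapter and are assumed here: the hook length formula for $\dim \mathcal{P}_\lambda$ and the Weyl dimension formula for $\dim \mathcal{Q}_\lambda^d$. The function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def partitions(n, d, largest=None):
    """Yield the elements of I_{d,n}: partitions of n into at most d parts, zero-padded."""
    if largest is None:
        largest = n
    if d == 0:
        if n == 0:
            yield ()
        return
    for first in range(min(n, largest), -1, -1):
        for rest in partitions(n - first, d - 1, first):
            yield (first,) + rest

def dim_sym(lam):
    """dim P_lambda via the hook length formula (assumed standard result)."""
    rows = [x for x in lam if x > 0]
    hooks = 1
    for i, row in enumerate(rows):
        for j in range(row):
            arm = row - j - 1
            leg = sum(1 for r in rows[i + 1:] if r > j)
            hooks *= arm + leg + 1
    return factorial(sum(rows)) // hooks

def dim_unitary(lam):
    """dim Q_lambda^d with d = len(lam), via the Weyl dimension formula (assumed)."""
    d = len(lam)
    result = Fraction(1)
    for i in range(d):
        for j in range(i + 1, d):
            result *= Fraction(lam[i] - lam[j] + j - i, j - i)
    return int(result)

def total_dimension(n, d):
    return sum(dim_unitary(lam) * dim_sym(lam) for lam in partitions(n, d))

for n, d in [(2, 2), (3, 2), (3, 3), (4, 2)]:
    print(n, d, list(partitions(n, d)), total_dimension(n, d), d ** n)
```

For $d = 2$, $n = 2$ this gives the partitions $(2,0)$ and $(1,1)$ with dimensions $3 \cdot 1 + 1 \cdot 1 = 4$, matching the triplet and singlet of two qubits.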

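The Schur basis can be expressed as superpositions over the standard computational basis states
$|i_1, i_2, \ldots, i_n\rangle$ as

$$|\lambda, q_\lambda, p_\lambda\rangle_{\mathrm{Sch}} = \sum_{i_1, i_2, \ldots, i_n} \left[U_{\mathrm{Sch}}\right]^{\lambda, q_\lambda, p_\lambda}_{i_1, i_2, \ldots, i_n} |i_1 i_2 \ldots i_n\rangle, \tag{5.17}$$

where $U_{\mathrm{Sch}}$ is the unitary transformation implementing the isomorphism in Eq. (5.16). Thus, for any
$U \in \mathcal{U}_d$ and any $s \in \mathcal{S}_n$,

$$U_{\mathrm{Sch}} \mathbf{Q}(U)\mathbf{P}(s) U_{\mathrm{Sch}}^\dagger = \sum_{\lambda \in \mathcal{I}_{d,n}} |\lambda\rangle\langle\lambda| \otimes \mathbf{q}_\lambda^d(U) \otimes \mathbf{p}_\lambda(s). \tag{5.18}$$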

If we now think of $U_{\mathrm{Sch}}$ as a quantum circuit, it will map the Schur basis state $|\lambda, q_\lambda, p_\lambda\rangle_{\mathrm{Sch}}$ to the
computational basis state $|\lambda, q_\lambda, p_\lambda\rangle$ with $\lambda$, $q_\lambda$, and $p_\lambda$ expressed as bit strings. The dimensions of
the irreps $\mathbf{p}_\lambda$ and $\mathbf{q}_\lambda^d$ vary with $\lambda$, so we will need to pad the $|q_\lambda, p_\lambda\rangle$ registers when they are expressed
as bitstrings. We will label the padded basis as $|\lambda\rangle|q\rangle|p\rangle$, explicitly dropping the $\lambda$ dependence. In
Chapter 7 we will show how to do this padding efficiently with only a logarithmic spatial overhead. We
will refer to the transform from the computational basis $|i_1, i_2, \ldots, i_n\rangle$ to the basis of three bitstrings
$|\lambda\rangle|q\rangle|p\rangle$ as the Schur transform. The Schur transform is shown schematically in Fig. 5-1. Notice
that just as the standard computational basis $|i\rangle$ is arbitrary up to a unitary transform, the bases for
$\mathcal{Q}_\lambda^d$ and $\mathcal{P}_\lambda$ are also both arbitrary up to a unitary transform, though we will later choose particular
bases for $\mathcal{Q}_\lambda^d$ and $\mathcal{P}_\lambda$.

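Example of the Schur transform: Let $d = 2$. Then for $n = 2$ there are two valid partitions,
$\lambda_1 = 2, \lambda_2 = 0$ and $\lambda_1 = \lambda_2 = 1$. Here the Schur transform corresponds to the change of basis from
the standard basis to the singlet and triplet basis:
$|\lambda = (1,1), q_\lambda = 0, p_\lambda = 0\rangle_{\text{Sch}} = \frac{1}{\sqrt{2}}(|01\rangle - |10\rangle)$,
$|\lambda = (2,0), q_\lambda = +1, p_\lambda = 0\rangle_{\text{Sch}} = |00\rangle$,
$|\lambda = (2,0), q_\lambda = 0, p_\lambda = 0\rangle_{\text{Sch}} = \frac{1}{\sqrt{2}}(|01\rangle + |10\rangle)$, and
$|\lambda = (2,0), q_\lambda = -1, p_\lambda = 0\rangle_{\text{Sch}} = |11\rangle$. Abstractly, then, the Schur transform corresponds to a
transformation whose rows are labeled by the Schur basis states (in the order just listed) and whose
columns are labeled by $|00\rangle, |01\rangle, |10\rangle, |11\rangle$:

$$U_{\text{Sch}} = \begin{array}{l|cccc}
 & |00\rangle & |01\rangle & |10\rangle & |11\rangle \\ \hline
|\lambda = (1,1), q_\lambda = 0\rangle & 0 & \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & 0 \\
|\lambda = (2,0), q_\lambda = +1\rangle & 1 & 0 & 0 & 0 \\
|\lambda = (2,0), q_\lambda = 0\rangle & 0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 \\
|\lambda = (2,0), q_\lambda = -1\rangle & 0 & 0 & 0 & 1
\end{array} \tag{5.19}$$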



It is easy to verify that the $\lambda = (1, 1)$ subspace transforms as a one-dimensional irrep of $\mathcal{U}_2$ and as the
alternating sign irrep of $\mathcal{S}_2$, while the $\lambda = (2, 0)$ subspace transforms as a three-dimensional irrep of $\mathcal{U}_2$
and as the trivial irrep of $\mathcal{S}_2$. Notice that the labeling scheme for the standard computational basis
uses 2 qubits while the labeling scheme for the Schur basis uses more qubits (one such labeling assigns
one qubit to $|\lambda\rangle$, none to $|p\rangle$ and two qubits to $|q\rangle$). Thus we see how padding will be necessary to
directly implement the Schur transform.
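The two-qubit example can be checked numerically. The following is a minimal sketch in plain Python (the names `U_SCH`, `SWAP`, `matmul`, `transpose` and `close` are illustrative and not from the text); it assumes the row ordering used in Eq. (5.19) and verifies that the matrix is unitary and that the transposition in $\mathcal{S}_2$ acts as $-1$ on the singlet and $+1$ on the triplet.

```python
import math

s = 1 / math.sqrt(2)

# Rows: Schur basis states; columns: |00>, |01>, |10>, |11> (as in Eq. (5.19)).
U_SCH = [
    [0.0,   s,  -s, 0.0],   # lambda=(1,1), q=0  (singlet)
    [1.0, 0.0, 0.0, 0.0],   # lambda=(2,0), q=+1
    [0.0,   s,   s, 0.0],   # lambda=(2,0), q=0
    [0.0, 0.0, 0.0, 1.0],   # lambda=(2,0), q=-1
]

# P(s) for the transposition in S_2: exchanges |01> and |10>.
SWAP = [
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
]

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

identity = [[float(i == j) for j in range(4)] for i in range(4)]

# U_SCH is real, so its adjoint is its transpose.
swap_in_schur_basis = matmul(matmul(U_SCH, SWAP), transpose(U_SCH))
```

In the Schur basis the swap becomes diagonal, with one $-1$ entry for the $\lambda = (1,1)$ block and $+1$ on the three-dimensional $\lambda = (2,0)$ block, which is the $n = 2$ instance of Eq. (5.18).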

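To see a more complicated example of the Schur basis, let $d = 2$ and $n = 3$. There are again two
valid partitions, $\lambda = (3, 0)$ and $\lambda = (2, 1)$. The first of these partitions labels the trivial irrep of
$\mathcal{S}_3$ and a four-dimensional irrep of $\mathcal{U}_2$. The corresponding Schur basis vectors can be expressed as

$$\begin{aligned}
|\lambda = (3,0), q_\lambda = +3/2, p_\lambda = 0\rangle_{\text{Sch}} &= |000\rangle \\
|\lambda = (3,0), q_\lambda = +1/2, p_\lambda = 0\rangle_{\text{Sch}} &= \tfrac{1}{\sqrt{3}}\left(|001\rangle + |010\rangle + |100\rangle\right) \\
|\lambda = (3,0), q_\lambda = -1/2, p_\lambda = 0\rangle_{\text{Sch}} &= \tfrac{1}{\sqrt{3}}\left(|011\rangle + |101\rangle + |110\rangle\right) \\
|\lambda = (3,0), q_\lambda = -3/2, p_\lambda = 0\rangle_{\text{Sch}} &= |111\rangle
\end{aligned} \tag{5.20}$$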



The second of these partitions labels a two-dimensional irrep of $\mathcal{S}_3$ and a two-dimensional irrep
of $\mathcal{U}_2$. Its Schur basis states can be expressed as

$$\begin{aligned}
|\lambda = (2,1), q_\lambda = +1/2, p_\lambda = 0\rangle_{\text{Sch}} &= \tfrac{1}{\sqrt{2}}\left(|100\rangle - |010\rangle\right) \\
|\lambda = (2,1), q_\lambda = -1/2, p_\lambda = 0\rangle_{\text{Sch}} &= \tfrac{1}{\sqrt{2}}\left(|101\rangle - |011\rangle\right) \\
|\lambda = (2,1), q_\lambda = +1/2, p_\lambda = 1\rangle_{\text{Sch}} &= \sqrt{\tfrac{2}{3}}\,|001\rangle - \tfrac{1}{\sqrt{6}}\left(|010\rangle + |100\rangle\right) \\
|\lambda = (2,1), q_\lambda = -1/2, p_\lambda = 1\rangle_{\text{Sch}} &= \sqrt{\tfrac{2}{3}}\,|110\rangle - \tfrac{1}{\sqrt{6}}\left(|101\rangle + |011\rangle\right)
\end{aligned} \tag{5.21}$$

We can easily verify that Eqns. (5.20) and (5.21) indeed transform under $\mathcal{U}_2$ and $\mathcal{S}_3$ the way we
expect; not so easy, however, is coming up with a circuit that relates this basis to the computational
basis and generalizes naturally to other values of $n$ and $d$. However, note that $p_\lambda$ determines whether
the first two qubits are in a singlet or a triplet state. This gives a hint of a recursive structure that
we will exploit in Chapter 7 to construct an efficient general algorithm for the Schur transform.
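A small numerical check of the three-qubit example may be helpful. The sketch below (plain Python; the names `SCHUR`, `vec`, `inner`, `swap_first_two`, `is_orthonormal` and `swap_sign` are illustrative, not from the text) writes out the eight vectors of Eqns. (5.20) and (5.21), checks that they are orthonormal, and evaluates the expectation of the transposition of the first two qubits, which is $-1$ on a singlet and $+1$ on a triplet.

```python
import math
from itertools import product

BASIS = ["".join(bits) for bits in product("01", repeat=3)]  # "000", ..., "111"

def vec(terms):
    """Dense amplitude vector from a {bitstring: amplitude} dictionary."""
    return [terms.get(b, 0.0) for b in BASIS]

r2, r3, r6 = math.sqrt(2), math.sqrt(3), math.sqrt(6)
a = math.sqrt(2 / 3)

# Keys are (lambda, q, p); values are the vectors of Eqns. (5.20) and (5.21).
SCHUR = {
    ((3, 0), +1.5, 0): vec({"000": 1.0}),
    ((3, 0), +0.5, 0): vec({"001": 1 / r3, "010": 1 / r3, "100": 1 / r3}),
    ((3, 0), -0.5, 0): vec({"011": 1 / r3, "101": 1 / r3, "110": 1 / r3}),
    ((3, 0), -1.5, 0): vec({"111": 1.0}),
    ((2, 1), +0.5, 0): vec({"100": 1 / r2, "010": -1 / r2}),
    ((2, 1), -0.5, 0): vec({"101": 1 / r2, "011": -1 / r2}),
    ((2, 1), +0.5, 1): vec({"001": a, "010": -1 / r6, "100": -1 / r6}),
    ((2, 1), -0.5, 1): vec({"110": a, "101": -1 / r6, "011": -1 / r6}),
}

def inner(u, v):
    """Inner product of two real amplitude vectors."""
    return sum(x * y for x, y in zip(u, v))

def swap_first_two(v):
    """Apply the transposition of qubits 1 and 2 to an amplitude vector."""
    amp = dict(zip(BASIS, v))
    return [amp[b[1] + b[0] + b[2]] for b in BASIS]

def is_orthonormal(vectors, tol=1e-12):
    return all(abs(inner(u, v) - (1.0 if i == j else 0.0)) < tol
               for i, u in enumerate(vectors) for j, v in enumerate(vectors))

def swap_sign(v):
    """Expectation value of the transposition of the first two qubits."""
    return inner(v, swap_first_two(v))
```

For the $\lambda = (2,1)$ vectors, `swap_sign` returns $-1$ when $p_\lambda = 0$ and $+1$ when $p_\lambda = 1$, which is the singlet/triplet structure on the first two qubits noted above.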




Figure 5-1: The Schur transform. Notice how the direct sum over $\lambda$ in Eq. (5.16) becomes a tensor
product between the $|\lambda\rangle$ register and the $|q\rangle$ and $|p\rangle$ registers. Since the number of qubits needed for
$|q\rangle$ and $|p\rangle$ varies with $\lambda$, we need slightly more spatial resources, which are here denoted by the ancilla
input $|0\rangle$.



5.3.1 Constructing $\mathcal{Q}_\lambda^d$ and $\mathcal{P}_\lambda$ using Schur duality

So far we have said little about the form of $\mathcal{Q}_\lambda^d$ and $\mathcal{P}_\lambda$, other than that they are indexed by partitions.
It turns out that Schur duality gives a straightforward description of the irreps of $\mathcal{U}_d$ and $\mathcal{S}_n$. We
will not use this explicit description to construct the Schur transform, but it is still helpful for
understanding the irreps $\mathcal{Q}_\lambda^d$ and $\mathcal{P}_\lambda$. As with the rest of this chapter, proofs and further details can
be found in [GW98].

We begin by expressing $\lambda \in \mathcal{I}_{d,n}$ as a Young diagram in which there are up to $d$ rows with $\lambda_i$
boxes in row $i$. For example, to the partition $(4, 3, 1, 1)$ we associate the diagram

    □ □ □ □
    □ □ □
    □
    □                                                    (5.22)

Now we define a Young tableau $T$ of shape $\lambda$ to be a way of filling the $n$ boxes of $\lambda$ with the integers
$1, \ldots, n$, using each number once and so that integers increase from left to right and from top to
bottom. For example, one valid Young tableau with shape $(4, 3, 1, 1)$ is

    1 4 6 7
    2 5 8
    3
    9
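The definition of a Young tableau is easy to test mechanically. The sketch below is illustrative only (the function names are not from the text): `is_young_tableau` checks the two monotonicity conditions, and `count_tableaux` counts the tableaux of a given shape using the observation that the box containing the largest entry $n$ must be a removable corner of the diagram.

```python
from functools import lru_cache

def is_young_tableau(rows):
    """True if `rows` has partition shape, uses 1..n once each, and increases
    from left to right in every row and from top to bottom in every column."""
    lengths = [len(r) for r in rows]
    if any(a < b for a, b in zip(lengths, lengths[1:])):
        return False
    entries = sorted(x for r in rows for x in r)
    if entries != list(range(1, len(entries) + 1)):
        return False
    rows_ok = all(r[j] < r[j + 1] for r in rows for j in range(len(r) - 1))
    cols_ok = all(rows[i][j] < rows[i + 1][j]
                  for i in range(len(rows) - 1)
                  for j in range(len(rows[i + 1])))
    return rows_ok and cols_ok

@lru_cache(maxsize=None)
def count_tableaux(shape):
    """Number of Young tableaux of the given shape (tuple of row lengths)."""
    shape = tuple(x for x in shape if x > 0)
    if not shape:
        return 1
    total = 0
    for i, row_len in enumerate(shape):
        below = shape[i + 1] if i + 1 < len(shape) else 0
        if row_len > below:  # the last box of row i is a removable corner
            total += count_tableaux(shape[:i] + (row_len - 1,) + shape[i + 1:])
    return total

EXAMPLE = [[1, 4, 6, 7], [2, 5, 8], [3], [9]]
```

For the shape $(4, 3, 1, 1)$ the recursion gives 216 tableaux, and for $(2, 1)$ it gives 2, matching the two-dimensional $\mathcal{S}_3$ irrep in the $n = 3$ example above.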









For any Young tableau $T$, define $\text{Row}(T)$ to be the set of permutations obtained by permuting the integers
within each row of $T$; similarly define $\text{Col}(T)$ to be the permutations that leave each integer in the
same column of $T$. Now we define the Young symmetrizer $\Pi_{\lambda:T}$ to be an operator acting on $(\mathbb{C}^d)^{\otimes n}$
as follows:

$$\Pi_{\lambda:T} := \frac{\dim \mathcal{P}_\lambda}{n!} \left( \sum_{c \in \text{Col}(T)} \text{sgn}(c)\, \mathbf{P}(c) \right) \left( \sum_{r \in \text{Row}(T)} \mathbf{P}(r) \right). \tag{5.23}$$

It can be shown that the Young symmetrizer $\Pi_{\lambda:T}$ is a projection operator whose support is a
subspace isomorphic to $\mathcal{Q}_\lambda^d$. In particular $U_{\text{Sch}} \Pi_{\lambda:T} U_{\text{Sch}}^\dagger = |\lambda\rangle\langle\lambda| \otimes |y(T)\rangle\langle y(T)| \otimes I_{\mathcal{Q}_\lambda^d}$ for some unit
vector $|y(T)\rangle \in \mathcal{P}_\lambda$. Moreover, these vectors $|y(T)\rangle$ form a basis known as Young's natural basis,
though the $|y(T)\rangle$ are not orthogonal, so we will usually not work with them in quantum circuits.

Using Young symmetrizers, we can now explore some more general examples of $\mathcal{Q}_\lambda^d$ and $\mathcal{P}_\lambda$. If
$\lambda = (n)$, then the only valid tableau is

    1 2 ... n

The corresponding $\mathcal{S}_n$-irrep $\mathcal{P}_{(n)}$ is trivial and the $\mathcal{U}_d$-irrep is given by the action of $\mathbf{Q}$ on the totally
symmetric subspace of $(\mathbb{C}^d)^{\otimes n}$, i.e. $\{|v\rangle : \mathbf{P}(s)|v\rangle = |v\rangle \ \forall s \in \mathcal{S}_n\}$. On the other hand, if $\lambda = (1^n)$,
meaning $(1, 1, \ldots, 1)$ ($n$ times), then the only valid tableau is

    1
    2
    ...
    n

The $\mathcal{S}_n$-irrep $\mathcal{P}_{(1^n)}$ is still one-dimensional, but now corresponds to the sign irrep of $\mathcal{S}_n$, mapping
$s$ to $\text{sgn}(s)$. The $\mathcal{U}_d$-irrep $\mathcal{Q}^d_{(1^n)}$ is equivalent to the totally antisymmetric subspace of $(\mathbb{C}^d)^{\otimes n}$, i.e.
$\{|v\rangle : \mathbf{P}(s)|v\rangle = \text{sgn}(s)|v\rangle \ \forall s \in \mathcal{S}_n\}$. Note that if $d < n$, then this subspace is zero-dimensional,
corresponding to the restriction that irreps of $\mathcal{U}_d$ are indexed only by partitions with $\le d$ rows.

Other explicit examples of $\mathcal{U}_d$ and $\mathcal{S}_n$ irreps are presented from a particle physics perspective in
[Geo99]. We also give more examples in Section 7.1.2, when we introduce explicit bases for $\mathcal{Q}_\lambda^d$ and
$\mathcal{P}_\lambda$.


5.4 Dual reductive pairs 

Schur duality can be generalized to groups other than $\mathcal{U}_d$ and $\mathcal{S}_n$. The groups for which this is
possible are known as dual reductive pairs, and in this section we give an overview of their definition
and properties (following Sec. 9.2 of [GW98]). The next two chapters will focus primarily on Schur
duality, but here we give some ideas about how the techniques used in those chapters could be applied
to other groups and other representations.

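Suppose $G$ and $K$ are groups with irreps $(\rho_\mu, U_\mu)_{\mu \in \hat{G}}$ and $(\sigma_\nu, V_\nu)_{\nu \in \hat{K}}$ respectively. Then the
irreps of $G \times K$ are given by $(\rho_\mu \otimes \sigma_\nu, U_\mu \otimes V_\nu)$. Now suppose $(\gamma, Y)$ is a representation of $G \times K$. Its
isotypic decomposition (cf. Eq. (5.2)) is of the form

$$Y \stackrel{G \times K}{\cong} \bigoplus_{\mu \in \hat{G}} \bigoplus_{\nu \in \hat{K}} U_\mu \otimes V_\nu \otimes \mathbb{C}^{m_{\mu,\nu}}, \tag{5.24}$$

where the $m_{\mu,\nu}$ are multiplicity factors. Define the algebras $A = \gamma(\mathbb{C}[G \times \{e\}])$ and $B = \gamma(\mathbb{C}[\{e\} \times K])$.
Then [GW98] proves the following generalization of Schur duality: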

Proposition 5.1. The following are equivalent:

(1) Each $m_{\mu,\nu}$ is either 0 or 1, and at most one $m_{\mu,\nu}$ is nonzero for each $\mu$ and each $\nu$. In other
words, Eq. (5.24) has the form

$$Y \stackrel{G \times K}{\cong} \bigoplus_{\lambda \in S} U_{\varphi_G(\lambda)} \otimes V_{\varphi_K(\lambda)} \tag{5.25}$$

where $S$ is some set and $\varphi_G : S \to \hat{G}$, $\varphi_K : S \to \hat{K}$ are injective maps.

(2) $B$ is the commutant of $A$ in $\text{End}(Y)$ (i.e. $B = \{x \in \text{End}(Y) : [x, a] = 0 \ \forall a \in A\}$) and $A$ is the
commutant of $B$. When this holds we say that $A$ and $B$ are double commutants.

When these conditions hold we say that the groups $\gamma(G \times \{e\})$ and $\gamma(\{e\} \times K)$ form a dual
reductive pair. In this case, Eq. (5.25) gives us a one-to-one correspondence between the subsets
of $\hat{G}$ and $\hat{K}$ that appear in $Y$. This redundancy can often be useful. For example, measuring the
$G$-irrep automatically also measures the $K$-irrep. In fact, the key idea behind the algorithms we will
encounter in Chapters 7 and 8 is that the Schur transform can be approached by working only with
$\mathcal{U}_d$-irreps or only with $\mathcal{S}_n$-irreps.

Many of the examples of dual reductive pairs that are known relate to the orthogonal and symplectic
Lie groups [How89], and so are not immediately applicable to quantum information. However,
in this section we will point out one example of a dual reductive pair that could arise naturally when
working with quantum states.

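Let $G = \mathcal{U}_{d_A}$ and $K = \mathcal{U}_{d_B}$ and define $W$ to be the $n$-th symmetric product of $\mathbb{C}^{d_A} \otimes \mathbb{C}^{d_B}$; i.e.

$$W := \left( (\mathbb{C}^{d_A} \otimes \mathbb{C}^{d_B})^{\otimes n} \right)^{\mathcal{S}_n} = \left\{ |v\rangle \in (\mathbb{C}^{d_A} \otimes \mathbb{C}^{d_B})^{\otimes n} : \mathbf{P}(s)|v\rangle = |v\rangle \ \forall s \in \mathcal{S}_n \right\}. \tag{5.26}$$

We have seen in Section 5.3.1 that $W \stackrel{\mathcal{U}_{d_A d_B}}{\cong} \mathcal{Q}^{d_A d_B}_{(n)}$. However, here we are interested in the action
of $\mathcal{U}_{d_A} \times \mathcal{U}_{d_B}$ on $W$, which we define in the natural way; i.e. $(U_A, U_B)$ is mapped to $(U_A \otimes U_B)^{\otimes n}$.
It is straightforward to show that $\mathcal{U}_{d_A}$ and $\mathcal{U}_{d_B}$ generate algebras that are double commutants. This
means that $W$ decomposes under $\mathcal{U}_{d_A} \times \mathcal{U}_{d_B}$ as

$$W \stackrel{\mathcal{U}_{d_A} \times \mathcal{U}_{d_B}}{\cong} \bigoplus_{\lambda \in \mathcal{I}_{d,n}} \mathcal{Q}_\lambda^{d_A} \otimes \mathcal{Q}_\lambda^{d_B}, \tag{5.27}$$

where $d = \min(d_A, d_B)$. This yields several nontrivial conclusions. For example, if the system were
shared between two parties, then this would mean that the states of both parties would have the
same Young frame. Also, it turns out that applying the Schur transform circuit in Section 7.2 to
either $A$ or $B$ gives an efficient method for performing the isomorphism in Eq. (5.27).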

The implications of other dual reductive pairs for quantum information are largely unknown.
However, in principle they offer far-ranging generalizations of Schur duality that remain amenable to
manipulation by the same sorts of algorithms.



Chapter 6 



Applications of the Schur transform 
to quantum information theory 

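In physics, the Schur basis is a natural way to study systems with permutation symmetry. In quantum
information theory, the Schur basis is well suited to i.i.d. states and channels, such as $\rho^{\otimes n}$ and $\mathcal{N}^{\otimes n}$.
For example, if $\rho$ is a $d \times d$ density matrix, then $\rho^{\otimes n}$ decomposes under the Schur transform as

$$U_{\text{Sch}}\, \rho^{\otimes n}\, U_{\text{Sch}}^\dagger = \sum_{\lambda \in \mathcal{I}_{d,n}} |\lambda\rangle\langle\lambda| \otimes \mathbf{q}_\lambda^d(\rho) \otimes I_{\mathcal{P}_\lambda}. \tag{6.1}$$

To prove this, and to interpret the $\mathbf{q}_\lambda^d(\rho)$ term, we note that irreps of $\mathcal{U}_d$ can also be interpreted as
irreps of $GL_d$ (the group of $d \times d$ complex invertible matrices)*. If $\rho$ is not an invertible matrix then
we can still express $\rho$ as a limit of elements of $GL_d$ and can use the continuity of $\mathbf{q}_\lambda^d$ to define $\mathbf{q}_\lambda^d(\rho)$.
Then Eq. (6.1) follows from Eq. (5.18).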

The rest of the chapter will explore the implications of Eq. (6.1) and related equations. We will
see that the first register, $|\lambda\rangle$, corresponds to the spectrum of $\rho$, and indeed that a good estimate for
the spectrum of $\rho$ is given by measuring $|\lambda\rangle$ and guessing $(\lambda_1/n, \ldots, \lambda_d/n)$ for the spectrum. The $\mathcal{Q}_\lambda^d$
register depends on the spectrum ($\lambda$) for its structure, but itself contains information only about the
eigenbasis of $\rho$. Both of these registers are vanishingly small (on the order of $d^2 \log n$ qubits) but
contain all the features of $\rho$. The $\mathcal{P}_\lambda$ register contains almost all the entropy, but always carries a
uniform distribution that is independent of $\rho$ once we condition on $\lambda$.

This situation can be thought of as a generalization of the classical method of types, a technique in
information theory in which strings drawn from i.i.d. distributions are classified by their empirical
distributions. We give a brief review of this method in Section 6.1 so that the reader will be able to
appreciate the similarities with the quantum case. In Section 6.2, we show how Eq. (6.1) leads to a
quantum method of types, and give quantitative bounds to make the theory useful. We survey known
applications of Schur duality to quantum information theory in Section 6.3, using our formulae from
Section 6.2 to give concise proofs of some of the main results from the literature. Finally, we
show how Schur duality may be used to decompose i.i.d. quantum channels in Section 6.4.

Only the last section represents completely new work. The idea of Schur duality as a quantum
method of types has been known for years, beginning with applications to quantum hypothesis
testing [Hay01] and spectrum estimation [KW01], further developed in a series of papers by Hayashi
and Matsumoto [HM01, HM02c, HM02a, HM02b, Hay02b, Hay02a], extended to other applications
in [Bac01, KBLW01, BRS03, BRS04, vKK04, HHH05], and recently applied to information theory in
[CM04]. The contribution of the first three sections is to present these results together as applications
of the same general method.

*This is because $GL_d$ is the complexification of $\mathcal{U}_d$, meaning that its Lie algebra (the set of all $d \times d$ complex
matrices) is equal to the tensor product of $\mathbb{C}$ with the Lie algebra of $\mathcal{U}_d$ (the set of $d \times d$ Hermitian matrices). See
[GW98, FH91] for more details. For this reason, mathematicians usually discuss the representation theory of $GL_d$
instead of $\mathcal{U}_d$.

6.1 The classical method of types 

The method of types is a powerful tool in classical information theory. Here we briefly review the 
method of types (following [CT91, CK81]) to give an idea of how the Schur basis will later be used 
for the quantum generalization. 

Consider a string $x^n = (x_1, \ldots, x_n) \in [d]^n$, where $[d] := \{1, \ldots, d\}$. Define the type of $x^n$ to be the
$d$-tuple of integers $t(x^n) := \sum_{j=1}^n e_{x_j}$, where $e_i \in \mathbb{Z}^d$ is the unit vector with a one in the $i$-th position. Thus
$t(x^n)$ counts the frequency of each symbol $1, \ldots, d$ in $x^n$. Let $\mathcal{T}_d^n := \{(n_1, \ldots, n_d) : n_1 + \ldots + n_d =
n, n_i \ge 0\}$ denote the set of all possible types of strings in $[d]^n$ (also known as the weak $d$-compositions
of $n$). Since an element of $\mathcal{T}_d^n$ can be written as $d$ numbers ranging from $0, \ldots, n$ we obtain the simple
bound $|\mathcal{T}_d^n| \le (n + 1)^d$. In fact, $|\mathcal{T}_d^n| = \binom{n+d-1}{d-1}$, but knowing the exact number is rarely necessary.
For a type $t$, let the normalized probability distribution $\bar{t} := t/n$ denote its empirical distribution.
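As a concrete illustration of these definitions, the following sketch (plain Python; `type_of` and `all_types` are illustrative names, not from the text) computes the type of a string and enumerates, by brute force over all of $[d]^n$, the set of types that actually occur, so that its size can be compared with $\binom{n+d-1}{d-1}$ and with the cruder bound $(n+1)^d$.

```python
from itertools import product
from math import comb

def type_of(x, d):
    """Type t(x^n): the number of occurrences of each symbol 1..d in x."""
    t = [0] * d
    for symbol in x:
        t[symbol - 1] += 1
    return tuple(t)

def all_types(n, d):
    """Set of types realised by strings in [d]^n (brute force; small n, d only)."""
    return {type_of(x, d) for x in product(range(1, d + 1), repeat=n)}

n, d = 5, 3
types = all_types(n, d)
number_of_types = len(types)          # equals comb(n + d - 1, d - 1)
crude_bound = (n + 1) ** d            # the bound (n + 1)^d
```

For $n = 5$ and $d = 3$ there are $\binom{7}{2} = 21$ types, compared with $3^5 = 243$ strings and the bound $(n+1)^d = 216$.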

The set of types $\mathcal{T}_d^n$ is larger than the set of partitions $\mathcal{I}_{d,n}$ because symbol frequencies in types do
not have to occur in decreasing order. In principle, we could separate a type $t \in \mathcal{T}_d^n$ into a partition
$\lambda \in \mathcal{I}_{d,n}$ (with nonincreasing parts) and a mapping of the parts of $\lambda$ onto $[d]$, which we call $q_\lambda$. The
map $q_\lambda$ corresponds to some $(a_1, \ldots, a_d) \in \mathcal{S}_d$ for which there are $\lambda_i$ symbols equal to $a_i$ for each
$i \in \{1, \ldots, d\}$. However, if not all the $\lambda_i$ are distinct, then this is more information than we need.
In particular, if $\lambda_i = \lambda_{i+1} = \ldots = \lambda_j$, then we don't care about the ordering of $a_i, \ldots, a_j$. Define
$m_i(\lambda)$ to be the number of parts of $\lambda$ equal to $i$, i.e. $|\{j : \lambda_j = i\}|$. Then the number of distinct $q_\lambda$ is
$d!/(m_0!\, m_1! \cdots m_n!) =: \binom{d}{m}$. This separation is not usually used for classical information theory, but helps
show how the quantum analogue of type is split among the $|\lambda\rangle$ and $|q\rangle$ registers.

For a particular type $t \in \mathcal{T}_d^n$, denote the set of all strings in $[d]^n$ with type $t$ by $T_t = \{x^n \in [d]^n :
t(x^n) = t\}$. There are two useful facts about $T_t$. First, $|T_t| = \binom{n}{t} := n!/(t_1! \cdots t_d!)$ (or equivalently
$|T_t| = \binom{n}{\lambda}$, where $\lambda$ is a sorted version of $t$). Second, let $P$ be a probability distribution on $[d]$ and $P^{\otimes n}$
the probability distribution on $[d]^n$ given by $n$ i.i.d. copies of $P$, i.e. $P^{\otimes n}(x^n) := P(x_1) \cdots P(x_n)$.
Then for any $x^n \in T_t$ we have $P^{\otimes n}(x^n) = P(1)^{t_1} \cdots P(d)^{t_d} = \exp\left(\sum_{j=1}^d t_j \log P(j)\right)$. This has
a natural expression in terms of the entropic quantities $H(\bar{t}) := -\sum_j \bar{t}_j \log \bar{t}_j$ and $D(\bar{t}\|P) :=
\sum_j \bar{t}_j \log\left(\bar{t}_j / P(j)\right)$ as

$$P^{\otimes n}(x^n) = \exp\left(-n\left(H(\bar{t}) + D(\bar{t}\|P)\right)\right). \tag{6.2}$$

These basic facts can be combined with simple probabilistic arguments to prove many results in
classical information theory. For example, if we define $P^{\otimes n}(T_t) := \sum_{x^n \in T_t} P^{\otimes n}(x^n)$, then

$$\bar{t}^{\otimes n}(T_t) = \binom{n}{t} \exp(-nH(\bar{t})). \tag{6.3}$$

Since $\bar{t}^{\otimes n}(T_t) \le 1$, we get the bound $\binom{n}{t} \le \exp(nH(\bar{t}))$. On the other hand, by doing a bit of
algebra [CT91] one can show that $\bar{t}^{\otimes n}(T_t) \ge \bar{t}^{\otimes n}(T_{t'})$ for any $t' \in \mathcal{T}_d^n$; i.e. under the probability
distribution $\bar{t}^{\otimes n}$, the most likely type is $t$. This allows us to lower bound $\binom{n}{t}$ by $\exp(nH(\bar{t}))/|\mathcal{T}_d^n|$.
Together these bounds are

$$(n + 1)^{-d} \exp(nH(\bar{t})) \le |T_t| = \binom{n}{t} \le \exp(nH(\bar{t})). \tag{6.4}$$
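The bounds in Eq. (6.4) can be checked exhaustively for small $n$ and $d$. The sketch below is illustrative (the function names are not from the text) and uses natural logarithms throughout, so that $\exp$ and $\log$ are consistent; it also evaluates the probability of a single string of type $t$, which is the quantity appearing in Eq. (6.2).

```python
from math import exp, factorial, log

def compositions(n, d):
    """All types in T_d^n: d-tuples of nonnegative integers summing to n."""
    if d == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, d - 1):
            yield (first,) + rest

def type_class_size(t):
    """|T_t| = n! / (t_1! ... t_d!), computed exactly with integers."""
    size = factorial(sum(t))
    for ti in t:
        size //= factorial(ti)
    return size

def entropy(p):
    """Shannon entropy in nats, with the convention 0 log 0 = 0."""
    return -sum(x * log(x) for x in p if x > 0)

def size_bounds_hold(t, slack=1e-9):
    """Check Eq. (6.4): (n+1)^(-d) exp(nH) <= |T_t| <= exp(nH)."""
    n, d = sum(t), len(t)
    upper = exp(n * entropy([ti / n for ti in t]))
    lower = upper / (n + 1) ** d
    # The small slack absorbs floating-point error when the upper bound is tight.
    return lower <= type_class_size(t) <= upper * (1 + slack)

def string_probability(t, P):
    """Probability under n i.i.d. copies of P of any single string of type t."""
    prob = 1.0
    for ti, pi in zip(t, P):
        prob *= pi ** ti
    return prob
```

The upper bound is tight for types such as $(n, 0, \ldots, 0)$, where $|T_t| = 1$ and $H(\bar{t}) = 0$, which is why the comparison allows a small floating-point slack.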

Combining Eqns. (6.4) and (6.2) for an arbitrary distribution $P$ then gives

$$(n + 1)^{-d} \exp(-nD(\bar{t}\|P)) \le P^{\otimes n}(T_t) \le \exp(-nD(\bar{t}\|P)). \tag{6.5}$$

Thus, as $n$ grows large, we are likely to observe an empirical distribution $\bar{t}$ that is close to the
actual distribution $P$. To formalize this, define the set of typical sequences $T^n_{P,\delta}$ by

$$T^n_{P,\delta} := \bigcup_{\substack{t \in \mathcal{T}_d^n \\ \|\bar{t} - P\|_1 \le \delta}} T_t. \tag{6.6}$$

To bound $P^{\otimes n}(T^n_{P,\delta})$, we apply Pinsker's inequality [Pin64]:

$$D(Q\|P) \ge \frac{1}{2}\|P - Q\|_1^2. \tag{6.7}$$

Denote the complement of $T^n_{P,\delta}$ by $[d]^n - T^n_{P,\delta}$. Then



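$$P^{\otimes n}\left([d]^n - T^n_{P,\delta}\right) = \sum_{\substack{t \in \mathcal{T}_d^n \\ \|\bar{t} - P\|_1 > \delta}} P^{\otimes n}(T_t) \le \sum_{\substack{t \in \mathcal{T}_d^n \\ \|\bar{t} - P\|_1 > \delta}} \exp\left(-nD(\bar{t}\|P)\right) \le (n + 1)^d \exp\left(-\frac{n\delta^2}{2}\right) \tag{6.8}$$

and therefore

$$P^{\otimes n}\left(T^n_{P,\delta}\right) \ge 1 - (n + 1)^d \exp\left(-\frac{n\delta^2}{2}\right). \tag{6.9}$$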



This has several useful consequences:

• Estimating the probability distribution $P$: If the true probability distribution of an i.i.d. process
is $P$ and we observe empirical distribution $\bar{t}$ on $n$ samples, the probability that $\|\bar{t} - P\|_1 > \delta$ is
$\le (n + 1)^d \exp\left(-\frac{n\delta^2}{2}\right)$, which decreases exponentially with $n$ for any constant value of $\delta$.

• Data compression (cf. Eq. (1.28)): We can compress $n$ letters from an i.i.d. source with
distribution $P$ by transmitting only strings in $T^n_{P,\delta}$. Asymptotically, the probability of error is
$\le (n + 1)^d \exp\left(-\frac{n\delta^2}{2}\right)$, which goes to zero as $n \to \infty$. The number of bits required is $\lceil \log |T^n_{P,\delta}| \rceil$.
To estimate this quantity, use Fannes' inequality (Lemma 1.1) to bound

$$|H(\bar{t}) - H(P)| \le \eta(\delta) + \delta \log d \tag{6.10}$$

whenever $\|\bar{t} - P\|_1 \le \delta$. Thus

$$\log |T^n_{P,\delta}| \le \log |\mathcal{T}_d^n| + n\left[H(P) + \eta(\delta) + \delta \log d\right] \le n\left[H(P) + \eta(\delta) + \delta \log d + \frac{d}{n} \log(n + 1)\right], \tag{6.11}$$

which asymptotically approaches $H(P)$ bits per symbol.

• Randomness concentration (cf. Eq. (1.29)): Suppose we are given a random variable $x^n$ distributed
according to $P^{\otimes n}$ and wish to produce from it some uniformly distributed random
bits. Then since all $x^n$ with the same type have the same probability, conditioning on the type
$t = t(x^n)$ is sufficient to give a uniformly distributed random variable. According to Eqns. (6.10)
and (6.9), this yields $\ge n(H(P) - \eta(\delta) - \delta \log d) = n(H(P) - o(1))$ bits with probability that
asymptotically approaches one.

If we have two random variables $X$ and $Y$ with a joint probability distribution $P(X, Y)$, then
we can define joint types and jointly typical sequences. These can be used to prove more sophisticated
results, such as Shannon's noisy coding theorem [Sha48] and the Classical Reverse Shannon
Theorem [BSST02, Win02]. Reviewing classical joint types would take us too far afield, but Section 6.4
will develop a quantum analogue of joint types which can be applied to channels or noisy bipartite
states.






Let us now summarize in a manner that shows the parallels with the quantum case. A string
$x^n \in [d]^n$ can be expressed as a triple $(\lambda, q_\lambda, p_\lambda)$ where $\lambda \in \mathcal{I}_{d,n}$, $q_\lambda \in Q_\lambda$ and $p_\lambda \in P_\lambda$ for sets $Q_\lambda$
and $P_\lambda$ satisfying $|Q_\lambda| \le \text{poly}(n)$ and $\exp(nH(\bar\lambda))/\text{poly}(n) \le |P_\lambda| \le \exp(nH(\bar\lambda))$, if we think of $d$
as a constant. Furthermore, permuting $x^n$ with an element of $\mathcal{S}_n$ affects only the $p_\lambda$ register and
for $f \in \mathcal{S}_d$, the map $x^n \to (f(x_1), \ldots, f(x_n))$ affects only the $q_\lambda$ register. This corresponds closely
with the quantum situation in Eq. (6.1). We now show how dimension counting in the quantum case
resembles the combinatorics of the classical method of types.



6.2 Schur duality as a quantum method of types 

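In this section, we generalize the classical method of types to quantum states. Our goal is to give
asymptotically tight bounds on $\mathcal{Q}_\lambda^d$ and $\mathcal{P}_\lambda$ and the other quantities appearing in Eq. (6.1) (following
[GW98, Hay02a, CM04]).

First recall that $|\mathcal{I}_{d,n}| \le |\mathcal{T}_d^n| \le (n + 1)^d = \text{poly}(n)$. For $\lambda \in \mathcal{I}_{d,n}$, define $\tilde\lambda := \lambda + (d - 1, d -
2, \ldots, 1, 0)$. Then the dimensions of $\mathcal{Q}_\lambda^d$ and $\mathcal{P}_\lambda$ are given by [GW98]

$$\dim \mathcal{Q}_\lambda^d = \frac{\prod_{1 \le i < j \le d} (\tilde\lambda_i - \tilde\lambda_j)}{\prod_{m=1}^{d-1} m!} \tag{6.12}$$

$$\dim \mathcal{P}_\lambda = \frac{n!}{\tilde\lambda_1!\, \tilde\lambda_2! \cdots \tilde\lambda_d!} \prod_{1 \le i < j \le d} (\tilde\lambda_i - \tilde\lambda_j). \tag{6.13}$$

It is straightforward to bound these by [Hay02a, CM04]

$$\dim \mathcal{Q}_\lambda^d \le (n + d)^{d(d-1)/2} \tag{6.14}$$

$$\binom{n}{\lambda} (n + d)^{-d(d-1)/2} \le \dim \mathcal{P}_\lambda \le \binom{n}{\lambda}. \tag{6.15}$$

Applying Eq. (6.4) to Eq. (6.15) yields the more useful

$$\exp\left(nH(\bar\lambda)\right) (n + d)^{-d(d+1)/2} \le \dim \mathcal{P}_\lambda \le \exp\left(nH(\bar\lambda)\right). \tag{6.16}$$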
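The dimension formulas (6.12) and (6.13) are simple to evaluate exactly. The sketch below is illustrative (the function names are not from the text); it enumerates $\mathcal{I}_{d,n}$, computes both dimensions with integer arithmetic, and provides a check of the bounds (6.14) and (6.16). A useful consistency test is that $\sum_\lambda \dim \mathcal{Q}_\lambda^d \cdot \dim \mathcal{P}_\lambda = d^n$, as required by Eq. (5.16).

```python
from math import exp, factorial, log, prod

def partitions(n, d, largest=None):
    """Partitions of n into at most d parts, as nonincreasing tuples of length d."""
    if largest is None:
        largest = n
    if d == 0:
        if n == 0:
            yield ()
        return
    for first in range(min(n, largest), -1, -1):
        for rest in partitions(n - first, d - 1, first):
            yield (first,) + rest

def shifted(lam):
    """lambda-tilde = lambda + (d-1, d-2, ..., 1, 0)."""
    d = len(lam)
    return [lam[i] + d - 1 - i for i in range(d)]

def vandermonde(lam):
    """Product over i < j of (lambda-tilde_i - lambda-tilde_j)."""
    s = shifted(lam)
    return prod(s[i] - s[j] for i in range(len(s)) for j in range(i + 1, len(s)))

def dim_Q(lam):
    """Eq. (6.12): dimension of the U_d irrep labeled by lam, with d = len(lam)."""
    d = len(lam)
    return vandermonde(lam) // prod(factorial(m) for m in range(1, d))

def dim_P(lam):
    """Eq. (6.13): dimension of the S_n irrep labeled by lam."""
    n = sum(lam)
    return factorial(n) * vandermonde(lam) // prod(factorial(x) for x in shifted(lam))

def bounds_hold(lam, slack=1e-9):
    """Check Eq. (6.14) and Eq. (6.16) for one partition (natural logarithms)."""
    n, d = sum(lam), len(lam)
    h = -sum((x / n) * log(x / n) for x in lam if x > 0)
    q_ok = dim_Q(lam) <= (n + d) ** (d * (d - 1) // 2)
    lower = exp(n * h) / (n + d) ** (d * (d + 1) // 2)
    p_ok = lower <= dim_P(lam) <= exp(n * h) * (1 + slack)
    return q_ok and p_ok
```

For example, `dim_P((4, 3, 1, 1))` evaluates to 216, and for $d = 2$, $\lambda = (2, 1)$ both dimensions equal 2, in agreement with the $n = 3$ example in Section 5.3.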

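We can use Eq. (6.1) to derive a quantum analogue of Eq. (6.2). To do so, we will need to
better describe the structure of $\mathcal{Q}_\lambda^d$. Define the torus $\mathcal{U}_1^{\times d} = \mathcal{U}_1 \times \ldots \times \mathcal{U}_1 \subset \mathcal{U}_d$ as the subgroup of
diagonal matrices (in some fixed basis of $\mathbb{C}^d$). For $x \in \mathbb{C}^d$ let $\text{diag}(x)$ denote the diagonal matrix
with entries $x_1, \ldots, x_d$. The (one-dimensional) irreps of $\mathcal{U}_1^{\times d}$ are labeled by $\mu \in \mathbb{Z}^d$ and are given
by $x^\mu := x_1^{\mu_1} \cdots x_d^{\mu_d}$. We will be interested only in $\mu$ with nonnegative entries, and we write $\mathbb{Z}_+^d$
to denote this set (note that this is different from $\mathcal{I}_{d,n}$ because the components of $\mu$ can be in any
order).

If $(\mathbf{q}, \mathcal{Q})$ is a polynomial representation of $\mathcal{U}_d$, then upon restriction to $\mathcal{U}_1^{\times d}$ one can show that it
breaks up into orthogonal subspaces labeled by different $\mu \in \mathbb{Z}_+^d$. The subspace corresponding to the
$\mathcal{U}_1^{\times d}$-representation $\mu$ is called the $\mu$-weight space of $\mathcal{Q}$ and is denoted $\mathcal{Q}(\mu)$. Formally, we can define
$\mathcal{Q}(\mu) \subset \mathcal{Q}$ by $\mathcal{Q}(\mu) := \{|q\rangle \in \mathcal{Q} : \mathbf{q}(\text{diag}(x_1, \ldots, x_d))|q\rangle = x_1^{\mu_1} \cdots x_d^{\mu_d}|q\rangle \ \forall x_1, \ldots, x_d \in \mathbb{C} \setminus \{0\}\}$. For
example $(\mathbb{C}^d)^{\otimes n}(\mu) = \text{Span}\{|x^n\rangle : x^n \in T_\mu\}$.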

To describe the weight spaces of $\mathcal{Q}_\lambda^d$ we define the Kostka coefficient $K_{\lambda\mu} := \dim \mathcal{Q}_\lambda^d(\mu)$ (as can be
easily checked, $K_{\lambda\mu}$ depends on $d$ only through $\lambda$ and $\mu$) [GW98]. While no useful formula is known
for $K_{\lambda\mu}$, they do satisfy

• $\sum_\mu K_{\lambda\mu} = \dim \mathcal{Q}_\lambda^d$

• $K_{\lambda\mu} \ne 0$ if and only if $\mu \prec \lambda$, meaning that $|\mu| = |\lambda|$ and $\sum_{i=1}^c \mu_i \le \sum_{i=1}^c \lambda_i$ for $c = 1, \ldots, d$
(with the entries of $\mu$ taken in nonincreasing order).

If we order weights according to the majorization relation $\prec$, then there exists a highest-weight vector
spanning the one-dimensional space $\mathcal{Q}_\lambda^d(\lambda)$. At the risk of some ambiguity, we call this vector $|\lambda\rangle$.
We will also define an orthonormal basis for $\mathcal{Q}_\lambda^d$, denoted $Q_\lambda^d$, in which each basis vector lies in a
single weight space. This is clearly possible in general, and also turns out to be consistent with the
basis we will introduce in Section 7.1.2 for use in quantum algorithms. To simplify notation later on,
whenever we work with a particular density matrix $\rho$, we will choose the torus $\mathcal{U}_1^{\times d}$ to be diagonal
with respect to the same basis as $\rho$. This means that $\mathbf{q}_\lambda^d(\rho)$ is diagonalized by $Q_\lambda^d$, the induced weight
basis for $\mathcal{Q}_\lambda^d$.

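We now have all the tools we need to find the spectrum of $\mathbf{q}_\lambda^d(\rho)$. Let the eigenvalues of $\rho$ be
given by $r_1 \ge \cdots \ge r_d$ (we sometimes write $r = \text{spec}\, \rho$). Then for all $\mu \in \mathcal{T}_d^n$, $\mathbf{q}_\lambda^d(\rho)$ has eigenvalue
$r^\mu = r_1^{\mu_1} \cdots r_d^{\mu_d}$ with multiplicity $K_{\lambda\mu}$. The highest eigenvalue is $r^\lambda = \exp[-n(H(\bar\lambda) + D(\bar\lambda\|r))]$ (since
$r$ is nonincreasing and $\mu \prec \lambda$ for any $\mu$ with $K_{\lambda\mu} \ne 0$). Thus we obtain the following bounds on
$\text{Tr}\, \mathbf{q}_\lambda^d(\rho)$:

$$r^\lambda \le \text{Tr}\, \mathbf{q}_\lambda^d(\rho) = \sum_{\mu \in \mathcal{T}_d^n} K_{\lambda\mu}\, r^\mu \le r^\lambda \dim \mathcal{Q}_\lambda^d. \tag{6.17}$$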

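To relate this to quantum states, let $\Pi_\lambda$ denote the projector onto $\mathcal{Q}_\lambda^d \otimes \mathcal{P}_\lambda \subset (\mathbb{C}^d)^{\otimes n}$. Explicitly
$\Pi_\lambda$ is given by

$$\Pi_\lambda = U_{\text{Sch}}^\dagger \left( |\lambda\rangle\langle\lambda| \otimes I_{\mathcal{Q}_\lambda^d} \otimes I_{\mathcal{P}_\lambda} \right) U_{\text{Sch}}. \tag{6.18}$$

From the bounds on $\dim \mathcal{Q}_\lambda^d$ and $\dim \mathcal{P}_\lambda$ in Eqns. (6.14) and (6.16), we obtain

$$\exp\left(nH(\bar\lambda)\right) (n + d)^{-d(d+1)/2} \le \text{Tr}\, \Pi_\lambda \le \exp\left(nH(\bar\lambda)\right) (n + d)^{d(d-1)/2}. \tag{6.19}$$

Also $\text{Tr}\, \Pi_\lambda \rho^{\otimes n} \Pi_\lambda = \text{Tr}\, \mathbf{q}_\lambda^d(\rho) \cdot \dim \mathcal{P}_\lambda$, which can be bounded by

$$\exp\left(-nD(\bar\lambda\|r)\right) (n + d)^{-d(d+1)/2} \le \text{Tr}\, \Pi_\lambda \rho^{\otimes n} \Pi_\lambda \le \exp\left(-nD(\bar\lambda\|r)\right) (n + d)^{d(d-1)/2}. \tag{6.20}$$

Similarly, we have

$$\Pi_\lambda \rho^{\otimes n} = \rho^{\otimes n} \Pi_\lambda = \Pi_\lambda \rho^{\otimes n} \Pi_\lambda \le r^\lambda\, \Pi_\lambda = \exp\left[-n\left(H(\bar\lambda) + D(\bar\lambda\|r)\right)\right] \Pi_\lambda. \tag{6.21}$$

For some values of $\mu$, $r^\mu$ can be much smaller, so we cannot express any useful lower bound on
the eigenvalues of $\Pi_\lambda \rho^{\otimes n} \Pi_\lambda$, like we can with classical types. Of course, tracing out $\mathcal{Q}_\lambda^d$ gives us a
maximally mixed state in $\mathcal{P}_\lambda$, and this is the quantum analogue of the fact that $P^{\otimes n}(\cdot|t)$ is uniformly
distributed over $T_t$.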

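We can also define the typical projector

$$\Pi^n_{r,\delta} := \sum_{\lambda : \bar\lambda \in B_\delta(r)} \Pi_\lambda = U_{\text{Sch}}^\dagger \left[ \sum_{\lambda : \bar\lambda \in B_\delta(r)} |\lambda\rangle\langle\lambda| \otimes I_{\mathcal{Q}_\lambda^d} \otimes I_{\mathcal{P}_\lambda} \right] U_{\text{Sch}}, \tag{6.22}$$

where $B_\delta(r) := \{\bar\lambda : \|\bar\lambda - r\|_1 \le \delta\}$. Using Pinsker's inequality, we find that

$$\text{Tr}\, \Pi^n_{r,\delta}\, \rho^{\otimes n} \ge 1 - \exp\left(-\frac{n\delta^2}{2}\right) (n + d)^{d(d+1)/2}, \tag{6.23}$$

similar to the classical case. The typical subspace is defined to be the support of the typical projector.
Its dimension can be bounded (using Eqns. (6.19) and (6.10)) by

$$\text{Tr}\, \Pi^n_{r,\delta} \le |\mathcal{I}_{d,n}| \max_{\lambda : \bar\lambda \in B_\delta(r)} \text{Tr}\, \Pi_\lambda \le (n + d)^{d(d+1)/2} \exp\left(n\left[H(r) + \eta(\delta) + \delta \log d\right]\right), \tag{6.24}$$

which is sufficient to derive Schumacher compression (cf. Eq. (1.24)).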

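The bounds described in this section are fairly simple, but are already powerful enough to derive
many results in quantum information theory. Before discussing those applications, we will describe
a variation of the decomposition of $\rho^{\otimes n}$ given in Eq. (6.1). Suppose we are given $n$ copies of a pure
state $|\psi\rangle^{AB}$ where $\rho^A = \text{Tr}_B\, \psi^{AB}$. This situation also arises when we work in the CP formalism (see
Section 1.1.1). Purifying both sides of Eq. (6.1) then gives us the alternate decomposition

$$\left(U_{\text{Sch}}^A \otimes U_{\text{Sch}}^B\right) \left(|\psi\rangle^{AB}\right)^{\otimes n} = \sum_{\lambda \in \mathcal{I}_{d,n}} c_\lambda\, |\lambda\rangle^A |\lambda\rangle^B |q_\lambda\rangle^{AB} |\Phi_{\mathcal{P}_\lambda}\rangle^{AB}. \tag{6.25}$$

Here $c_\lambda$ are coefficients satisfying $|c_\lambda|^2 = \text{Tr}\, \Pi_\lambda \rho^{\otimes n}$, $|q_\lambda\rangle$ are arbitrary states in $\mathcal{Q}_\lambda^d \otimes \mathcal{Q}_\lambda^d$ and $|\Phi_{\mathcal{P}_\lambda}\rangle$ is a
maximally entangled state* on $\mathcal{P}_\lambda \otimes \mathcal{P}_\lambda$.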

6.3 Applications of Schur duality 

The Schur transform is useful in a surprisingly large number of quantum information protocols. Here
we will review these applications using the formulae from the last section to rederive the main results.
It is worth noting that an efficient implementation of the Schur transform is the only nontrivial step
necessary to perform these protocols. Thus our construction of the Schur transform in the next
chapter will simultaneously make all of these tasks computationally efficient.

Spectrum and state estimation 

Suppose we are given many copies of an unknown mixed quantum state, $\rho^{\otimes n}$. An important task is
to obtain an estimate for the spectrum of $\rho$ from these $n$ copies. An asymptotically good estimate
(in the sense of large deviation rate) for the spectrum of $\rho$ can be obtained by applying the Schur
transform, measuring $\lambda$ and taking the spectrum estimate to be $(\lambda_1/n, \ldots, \lambda_d/n)$ [KW01, VLPT99].
Indeed the probability that $\|\bar\lambda - \text{spec}\, \rho\|_1 \le \delta$ for any $\delta > 0$ is bounded by Eq. (6.23). Thus an
efficient implementation of the Schur transform will efficiently implement the spectrum estimating
protocol (note that it is efficient in $d$, not in $\log(d)$).

The more general problem of estimating $\rho$ reduces to measuring $|\lambda\rangle$ and $\mathcal{Q}_\lambda^d$, but optimal estimators
have only been explicitly constructed for the case of $d = 2$ [GM02]. One natural estimation scheme is
given by first measuring $\lambda$ and then performing a covariant POVM on $\mathcal{Q}_\lambda^d$ with POVM elements

$$\mathbf{q}_\lambda^d(U)\, |\lambda\rangle\langle\lambda|\, \mathbf{q}_\lambda^d(U)^\dagger\, \dim \mathcal{Q}_\lambda^d\, dU, \tag{6.26}$$

where $|\lambda\rangle$ is the highest weight vector in $\mathcal{Q}_\lambda^d$ and $dU$ is a Haar measure for $\mathcal{U}_d$. The corresponding
state estimate is then $\hat\rho = U \left( \sum_{i=1}^d \bar\lambda_i |i\rangle\langle i| \right) U^\dagger$. In this estimation scheme, as $n \to \infty$ the probability
that $\|\hat\rho - \rho\|_1 > \delta$ scales as $\exp(-n f(\delta))$ with $f(\delta) > 0$ whenever $\delta > 0$; [Key04] proves this and
derives the function $f(\delta)$. However, it is not known whether the $f(\delta)$ obtained for this measurement
scheme is the best possible.

A related problem is quantum hypothesis testing (determining whether one has been given the
state $\rho^{\otimes n}$ or some other state). An optimal solution to quantum hypothesis testing can be obtained
by a similar protocol [Hay02b].

Universal distortion-free entanglement concentration 

Let $|\psi\rangle^{AB}$ be a bipartite partially entangled state shared between two parties, $A$ and $B$. Suppose
we are given many copies of $|\psi\rangle^{AB}$ and we want to transform these states into copies of a maximally
entangled state using only local operations and classical communication. Further, suppose that we
wish this protocol to work when neither $A$ nor $B$ knows the state $|\psi\rangle^{AB}$. Such a scheme is called
a universal (meaning it works with unknown states $|\psi\rangle^{AB}$) entanglement concentration protocol,
as opposed to the original entanglement concentration protocol described by [BBPS96]. Further,
we also would like the scheme to produce perfect maximally entangled states, i.e. to be distortion
free. Universal distortion-free entanglement concentration can be performed [HM02c] by both parties
performing Schur transforms on their $n$ halves of $|\psi\rangle^{AB}$, measuring their $|\lambda\rangle$, discarding $\mathcal{Q}_\lambda^d$ and
retaining $\mathcal{P}_\lambda$. According to Eq. (6.25), the two parties will now share a maximally entangled state of
dimension $\dim \mathcal{P}_\lambda$, where $\lambda$ is observed with probability $\dim \mathcal{P}_\lambda \cdot \text{Tr}\, \mathbf{q}_\lambda^d(\text{Tr}_B |\psi\rangle\langle\psi|)$.

According to Eqns. (6.23), (6.10) and (6.19), this produces at least $n(S(\rho) - \eta(\delta) - \delta \log d) - \frac{1}{2} d(d +
1) \log(n + d)$ ebits with probability $\ge 1 - \exp\left(-\frac{n\delta^2}{2}\right) (n + d)^{d(d+1)/2}$. The rate at which this error
probability vanishes for any fixed $\delta$ can be shown to be optimal among protocols of this form [HM02c].

*In fact, we will see in Section 6.4.1 that $|\Phi_{\mathcal{P}_\lambda}\rangle$ is uniquely determined.



Universal Compression with Optimal Overflow Exponent 

Measuring $|\lambda\rangle$ weakly so as to cause little disturbance, together with appropriate relabeling, comprises
a universal compression algorithm with optimal overflow exponent (rate of decrease of the probability
that the algorithm will output a state that is much too large) [HM02a, HM02b].

Alternatively, suppose we are given $R$ s.t. $H(\rho) < R$ and we want to compress $\rho^{\otimes n}$ into $nR$ qubits.
Define the projector $\Pi_R$ by

$$\Pi_R := \sum_{\substack{\lambda \in \mathcal{I}_{d,n} \\ H(\bar\lambda) \le R_n}} \Pi_\lambda, \tag{6.27}$$

where $R_n := R - \frac{1}{2n} d(d + 1) \log(n + d)$. Since $\text{Tr}\, \Pi_R \le \exp(nR)$, projecting onto $\Pi_R$ allows the residual
state to be compressed to $nR$ qubits. The error can be shown to be bounded by

$$\le (n + d)^{d(d+1)/2} \exp\left[ -n \min_{P : H(P) > R_n} D(P \| \text{spec}\, \rho) \right], \tag{6.28}$$

which decreases exponentially with $n$ as long as $R > H(\rho)$.



Encoding and decoding into decoherence-free subsystems 

Further applications of the Schur transform include encoding into decoherence-free subsystems [ZR97,
KLV00, KBLW01, Bac01]. Decoherence-free subsystems are subspaces of a system's Hilbert space
which are immune to decoherence due to a symmetry of the system-environment interaction. For
the case where the environment couples identically to all systems, information can be protected from
decoherence by encoding into the $|p_\lambda\rangle$ basis. We can use the inverse Schur transform (which, as a
circuit, can be implemented by reversing the order of all gate elements and replacing them with their
inverses) to perform this encoding: simply feed the appropriate $|\lambda\rangle$, the state to be encoded
in the $\mathcal{P}_\lambda$ register, and any state in the $\mathcal{Q}_\lambda^d$ register into the inverse Schur transform. Decoding
can similarly be performed using the Schur transform.

This encoding has no error and asymptotically unit efficiency, since $\log \max_\lambda \dim \mathcal{P}_\lambda$ qubits can
be sent and $\max_\lambda \dim \mathcal{P}_\lambda \ge d^n / (|\mathcal{I}_{d,n}| \max_\lambda \dim \mathcal{Q}_\lambda^d) \ge d^n (n + d)^{-d(d+1)/2}$.
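The claim of asymptotically unit efficiency can be illustrated for qubits. The sketch below is a rough numerical illustration, with hypothetical function names, that evaluates $\frac{1}{n} \log_2 \max_\lambda \dim \mathcal{P}_\lambda$ using Eq. (6.13) for $d = 2$ and compares it with the lower bound $1 - \frac{3}{n} \log_2(n + 2)$ implied by the inequality above.

```python
from math import factorial, log2

def dim_P(l1, l2):
    """Eq. (6.13) for d = 2, where lambda-tilde = (l1 + 1, l2)."""
    n = l1 + l2
    return factorial(n) * (l1 - l2 + 1) // (factorial(l1 + 1) * factorial(l2))

def best_rate(n):
    """Protected qubits per physical qubit: log2(max_lambda dim P_lambda) / n."""
    best = max(dim_P(n - l2, l2) for l2 in range(n // 2 + 1))
    return log2(best) / n

def rate_lower_bound(n):
    """Rate implied by max dim P >= 2^n (n + 2)^(-3), i.e. d = 2 in the text."""
    return 1 - 3 * log2(n + 2) / n
```

For $n = 10$ the largest irrep has dimension 90, giving a rate of about 0.65, and the rate approaches one as $n$ grows, consistent with the logarithmic overhead in the bound.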



Communication without a shared reference frame 

An application of the concepts of decoherence-free subsystems comes about when two parties wish to 
communicate (in either a classical or quantum manner) but do not share a reference frame. The effect 
of not sharing a reference frame is the same as the effect of collective decoherence: the same random 
unitary rotation is applied to each subsystem. Thus encoding information into the V\ register will 
allow this information to be communicated in spite of the fact that the two parties do not share a
reference frame [BRS03]. Just as with decoherence-free subsystems, this encoding and decoding can
be done with the Schur transform.



6.4 Normal form of memoryless channels 

So far we have only discussed the decomposition of $\rho^{\otimes n}$, or equivalently, of pure bipartite entangled
states. However, many interesting problems in quantum information theory involve what are effectively
tripartite states. Not only are tripartite states $|\psi\rangle^{ABC}$ interesting in themselves [Tha99], they also
appear when a noisy bipartite state $\rho^{AB}$ is replaced by its purification $|\psi\rangle^{ABE}$ and when a noisy
quantum channel $\mathcal{N}^{A \to B}$ is replaced by its purification $U_{\mathcal{N}}^{A \to BE}$. When considering $n$ copies of these
resources, much of their structure can be understood in terms of the vector spaces
$(\mathcal{P}_{\lambda_A} \otimes \mathcal{P}_{\lambda_B} \otimes \mathcal{P}_{\lambda_E})^{\mathcal{S}_n}$. We explain how this follows from the $\mathcal{S}_n$ CG transform in Section 6.4.1 and then apply
this to quantum channels in Section 6.4.2. Finally, we generalize the bounds from Section 6.2 to a
quantum analogue of joint typicality in Section 6.4.3.



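6.4.1 The $\mathcal{S}_n$ Clebsch-Gordan transformation

We begin by describing how the CG transform (cf. Section 5.2.2) specializes to $\mathcal{S}_n$. For $\lambda_A, \lambda_B \in \mathcal{I}_n$,
Eq. (5.4) implies

$$\mathcal{P}_{\lambda_A} \otimes \mathcal{P}_{\lambda_B} \cong \bigoplus_{\lambda_C \in \mathcal{I}_n} \mathcal{P}_{\lambda_C} \otimes \operatorname{Hom}(\mathcal{P}_{\lambda_C}, \mathcal{P}_{\lambda_A} \otimes \mathcal{P}_{\lambda_B})^{\mathcal{S}_n} \cong \bigoplus_{\lambda_C \in \mathcal{I}_n} \mathcal{P}_{\lambda_C} \otimes \mathbb{C}^{g_{\lambda_A \lambda_B \lambda_C}}. \qquad (6.29)$$

Here we have defined the Kronecker coefficient $g_{\lambda_A \lambda_B \lambda_C} := \dim \operatorname{Hom}(\mathcal{P}_{\lambda_C}, \mathcal{P}_{\lambda_A} \otimes \mathcal{P}_{\lambda_B})^{\mathcal{S}_n}$.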

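It can be shown that there is an orthonormal basis for $\mathcal{P}_\lambda$, which we call $P_\lambda$, in which the $\mathbf{p}_\lambda(s)$ are
real and orthogonal.* This means that $\mathcal{P}_\lambda^* \cong \mathcal{P}_\lambda$. Since $\operatorname{Hom}(A, B) \cong A^* \otimes B$, it follows that

$$\operatorname{Hom}(\mathcal{P}_{\lambda_C}, \mathcal{P}_{\lambda_A} \otimes \mathcal{P}_{\lambda_B})^{\mathcal{S}_n} \cong (\mathcal{P}_{\lambda_A} \otimes \mathcal{P}_{\lambda_B} \otimes \mathcal{P}_{\lambda_C})^{\mathcal{S}_n}. \qquad (6.30)$$

As a corollary, $g_{\lambda_A \lambda_B \lambda_C}$ is unchanged by permuting $\lambda_A, \lambda_B, \lambda_C$. Unfortunately, no efficient method
of calculating $g_{\lambda_A \lambda_B \lambda_C}$ is known, though asymptotically they have some connections to the quantum
mutual information that will be investigated in future work. The permutation symmetry of $g_{\lambda_A \lambda_B \lambda_C}$
also means that we can consider CG transformations from $AB \to C$, $AC \to B$ or $BC \to A$, with the
only difference being a normalization factor which we will explain below.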
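For very small $n$ the Kronecker coefficients can be computed directly from characters, which makes the permutation symmetry easy to see. The sketch below assumes the standard character formula $g_{abc} = \frac{1}{n!}\sum_{s \in \mathcal{S}_n} \chi_a(s)\chi_b(s)\chi_c(s)$ (valid because the characters of $\mathcal{S}_n$ are real) and hard-codes the character table of $\mathcal{S}_3$; it is an illustration for $n = 3$ only and not an efficient general method.

```python
from fractions import Fraction

# Character table of S_3: columns are the conjugacy classes
# (identity, transpositions, 3-cycles), which have sizes 1, 3 and 2.
CLASS_SIZES = (1, 3, 2)
CHARACTERS = {
    (3,): (1, 1, 1),        # trivial irrep
    (2, 1): (2, 0, -1),     # two-dimensional (standard) irrep
    (1, 1, 1): (1, -1, 1),  # sign irrep
}


def kronecker(a, b, c):
    """Kronecker coefficient g_{abc} for S_3, from the character inner product."""
    total = sum(size * x * y * z
                for size, x, y, z in zip(CLASS_SIZES, CHARACTERS[a],
                                         CHARACTERS[b], CHARACTERS[c]))
    g = Fraction(total, sum(CLASS_SIZES))
    assert g.denominator == 1  # a multiplicity must be an integer
    return int(g)
```

Because the formula is a symmetric function of the three characters, `kronecker` returns the same value for every ordering of its arguments, matching the corollary stated above.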

According to Eq. (6.30), the CG transformation can be understood in terms of tripartite $\mathcal{S}_n$-invariant
vectors. Let $|\alpha\rangle$ be a unit vector in $(\mathcal{P}_{\lambda_A} \otimes \mathcal{P}_{\lambda_B} \otimes \mathcal{P}_{\lambda_C})^{\mathcal{S}_n}$, with corresponding density
matrix $\alpha = |\alpha\rangle\langle\alpha|$. Since $\alpha^A := \operatorname{Tr}_{BC} \alpha$ is invariant under permutations and $\operatorname{Tr} \alpha = 1$, Schur's
Lemma implies that $\alpha^A = I_{\mathcal{P}_{\lambda_A}} / D_A$, with $D_A := \dim \mathcal{P}_{\lambda_A}$. This means we can Schmidt decompose
$|\alpha\rangle$ as

$$|\alpha\rangle^{ABC} = \frac{1}{\sqrt{D_A}} \sum_{p_A \in P_{\lambda_A}} |p_A\rangle^A \, W_\alpha^{A' \to BC} |p_A\rangle^{A'}, \qquad (6.31)$$

where $W_\alpha \in \operatorname{Hom}(\mathcal{P}_{\lambda_A}, \mathcal{P}_{\lambda_B} \otimes \mathcal{P}_{\lambda_C})^{\mathcal{S}_n}$ is an isometry. We can express $U_{\text{CG}}$ in terms of $W_\alpha$ according
to

$$U_{\text{CG}}^\dagger |\lambda_A\rangle |\alpha\rangle |p_A\rangle = W_\alpha |p_A\rangle. \qquad (6.32)$$

*One way to prove this is to consider Young's natural basis, which was introduced in Section 5.3.1. Since the $\Pi_{\lambda:T}$
produce real linear combinations of the states $|i_1, \ldots, i_n\rangle$, the matrices $\mathbf{p}_\lambda(s)$ are also real when written in Young's
natural basis. If we generate an orthonormal basis by applying Gram-Schmidt to Young's natural basis, the matrices
$\mathbf{p}_\lambda(s)$ remain real.

In Section 7.1.2 we will introduce a different orthonormal basis for $\mathcal{S}_n$, known as Young's orthogonal basis, or as the
Young-Yamanouchi basis. [JK81] gives an explicit formula in this basis for $\mathbf{p}_\lambda(s)$ in which the matrices are manifestly
real.

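The simple form of Eq. (6.31) suggests that the CG transformation can also be implemented by
teleportation. For any $\lambda \in \mathcal{I}_n$, let $D_\lambda := \dim \mathcal{P}_\lambda$ and define $|\Phi_\lambda\rangle = D_\lambda^{-1/2} \sum_{p \in P_\lambda} |p\rangle|p\rangle$. Note that up
to a phase $|\Phi_\lambda\rangle$ is the unique invariant vector in $\mathcal{P}_\lambda \otimes \mathcal{P}_\lambda$. To see that it is invariant, use the fact
that $(A \otimes I)|\Phi_\lambda\rangle = (I \otimes A^T)|\Phi_\lambda\rangle$ for any operator $A$ and the fact that the $\mathbf{p}_\lambda$ are orthogonal matrices,
so $\mathbf{p}_\lambda(s)^T = \mathbf{p}_\lambda(s)^{-1}$. Uniqueness follows from

$$\dim (\mathcal{P}_\lambda \otimes \mathcal{P}_\lambda)^{\mathcal{S}_n} = \dim \operatorname{Hom}(\mathcal{P}_\lambda, \mathcal{P}_\lambda)^{\mathcal{S}_n} = 1, \qquad (6.33)$$

where the first equality is because $\mathcal{P}_\lambda \cong \mathcal{P}_\lambda^*$ and the second equality is due to Schur's Lemma.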
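The two facts used in the invariance argument can be checked numerically on a small example. This is a minimal sketch with plain Python lists; the helper names (`phi`, `kron_apply`, `transpose`, `identity`, `close`) are illustrative, and a rotation matrix stands in for an orthogonal representation matrix $\mathbf{p}_\lambda(s)$.

```python
import math


def phi(D):
    """Amplitudes of the maximally entangled state D^(-1/2) sum_p |p>|p>."""
    v = [0.0] * (D * D)
    for p in range(D):
        v[p * D + p] = 1.0 / math.sqrt(D)
    return v


def kron_apply(A, B, v, D):
    """Apply (A tensor B) to v, where v[i*D + j] is the amplitude of |i>|j>."""
    out = [0.0] * (D * D)
    for i in range(D):
        for j in range(D):
            amp = v[i * D + j]
            if amp == 0.0:
                continue
            for k in range(D):
                for l in range(D):
                    out[k * D + l] += A[k][i] * B[l][j] * amp
    return out


def transpose(A):
    return [list(col) for col in zip(*A)]


def identity(D):
    return [[1.0 if i == j else 0.0 for j in range(D)] for i in range(D)]


def close(u, v, tol=1e-12):
    return all(abs(x - y) <= tol for x, y in zip(u, v))
```

With these helpers, `kron_apply(A, identity(D), phi(D), D)` and `kron_apply(identity(D), transpose(A), phi(D), D)` agree for any real matrix `A`, and `kron_apply(O, O, phi(D), D)` returns `phi(D)` whenever `O` is orthogonal.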

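We use $|\Phi_\lambda\rangle$ for teleportation as follows. If $|p_A\rangle \in \mathcal{P}_{\lambda_A}$ is a basis vector (i.e. $|p_A\rangle \in P_{\lambda_A}$), then
$\langle\Phi_{\lambda_A}|^{AA'} |p_A\rangle^A = D_A^{-1/2} \langle p_A|^{A'}$. Also, Eq. (6.31) can be written more simply as

$$|\alpha\rangle^{ABC} = (I^A \otimes W_\alpha^{A' \to BC}) |\Phi_{\lambda_A}\rangle^{AA'}. \qquad (6.34)$$

Combining Eqns. (6.34) and (6.32) then yields

$$\langle\Phi_{\lambda_A}|^{AA'} |\alpha\rangle^{A'BE} |p_A\rangle^A = \frac{1}{D_A} W_\alpha^{A \to BE} |p_A\rangle^A = \frac{1}{D_A} U_{\text{CG}}^\dagger |\lambda_A\rangle |\alpha\rangle |p_A\rangle. \qquad (6.35)$$

This connection between the quantum CG transform and $\mathcal{S}_n$-invariant tripartite states will now
be used to decompose i.i.d. quantum channels.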



6.4.2 Decomposition of memoryless quantum channels 

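Let $\mathcal{N} : A' \to B$ be a quantum channel and $U_{\mathcal{N}} : A' \to BE$ its isometric extension. Let $d_A =
\dim A'$, $d_B = \dim B$, $d_E = \dim E$ and $d := \max(d_A, d_B, d_E)$. We want to consider $n$ uses of $U_{\mathcal{N}}$ in the
Schur basis. In general this has the form

$$U_{\text{Sch}} U_{\mathcal{N}}^{\otimes n} U_{\text{Sch}}^\dagger = \sum_{\substack{\lambda_A, \lambda_B, \lambda_E \in \mathcal{I}_n \\ q_A \in Q_{\lambda_A},\, q_B \in Q_{\lambda_B},\, q_E \in Q_{\lambda_E} \\ p_A \in P_{\lambda_A},\, p_B \in P_{\lambda_B},\, p_E \in P_{\lambda_E}}} C^{\lambda_A q_A p_A}_{\lambda_B \lambda_E q_B q_E p_B p_E} \, |\lambda_B \lambda_E\rangle\langle\lambda_A| \otimes |q_B q_E\rangle\langle q_A| \otimes |p_B p_E\rangle\langle p_A| \qquad (6.36)$$

for some coefficients $C^{\lambda_A q_A p_A}_{\lambda_B \lambda_E q_B q_E p_B p_E}$. So far this tells us nothing at all! But we know that $U_{\mathcal{N}}^{\otimes n}$ is
invariant under permutations; i.e. $[\mathbf{P}(s^{-1})_B \otimes \mathbf{P}(s^{-1})_E] \, U_{\mathcal{N}}^{\otimes n} \, \mathbf{P}(s)_A = U_{\mathcal{N}}^{\otimes n}$ for all $s \in \mathcal{S}_n$. Thus

$$U_{\mathcal{N}}^{\otimes n} \in \operatorname{Hom}\big((\mathbb{C}^{d_A})^{\otimes n}, (\mathbb{C}^{d_B})^{\otimes n} \otimes (\mathbb{C}^{d_E})^{\otimes n}\big)^{\mathcal{S}_n} \cong \bigoplus_{\lambda_A, \lambda_B, \lambda_E} \operatorname{Hom}(\mathcal{Q}_{\lambda_A}^{d_A}, \mathcal{Q}_{\lambda_B}^{d_B} \otimes \mathcal{Q}_{\lambda_E}^{d_E}) \otimes \operatorname{Hom}(\mathcal{P}_{\lambda_A}, \mathcal{P}_{\lambda_B} \otimes \mathcal{P}_{\lambda_E})^{\mathcal{S}_n}. \qquad (6.37)$$

Let $P[\lambda_A; \lambda_B, \lambda_E]$ be an orthonormal basis for $\operatorname{Hom}(\mathcal{P}_{\lambda_A}, \mathcal{P}_{\lambda_B} \otimes \mathcal{P}_{\lambda_E})^{\mathcal{S}_n}$. Then we can expand
$U_{\text{Sch}} U_{\mathcal{N}}^{\otimes n} U_{\text{Sch}}^\dagger$ as

$$U_{\text{Sch}} U_{\mathcal{N}}^{\otimes n} U_{\text{Sch}}^\dagger = \sum_{\substack{\lambda_A, \lambda_B, \lambda_E \in \mathcal{I}_n,\ \alpha \in P[\lambda_A; \lambda_B, \lambda_E] \\ q_A \in Q_{\lambda_A}^{d_A},\, q_B \in Q_{\lambda_B}^{d_B},\, q_E \in Q_{\lambda_E}^{d_E}}} [V_{\mathcal{N}}^n]^{\lambda_A q_A}_{\lambda_B \lambda_E q_B q_E \alpha} \, |\lambda_B \lambda_E\rangle\langle\lambda_A| \otimes |q_B q_E\rangle\langle q_A| \otimes W_\alpha, \qquad (6.38)$$

where the coefficients $[V_{\mathcal{N}}^n]^{\lambda_A q_A}_{\lambda_B \lambda_E q_B q_E \alpha}$ correspond to an isometry; i.e.

$$\sum_{\lambda_B, \lambda_E, q_B, q_E, \alpha} \Big([V_{\mathcal{N}}^n]^{\lambda_A q_A}_{\lambda_B \lambda_E q_B q_E \alpha}\Big)^* [V_{\mathcal{N}}^n]^{\lambda_A' q_A'}_{\lambda_B \lambda_E q_B q_E \alpha} = \delta_{\lambda_A, \lambda_A'} \delta_{q_A, q_A'}. \qquad (6.39)$$

This is depicted as a quantum circuit in Fig. 6-1.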



Figure 6-1: The quantum channel $U_{\mathcal{N}}^{\otimes n}$ is decomposed in the Schur basis as in Eq. (6.38). Alice inputs
an $n$-qudit state of the form $|\lambda_A\rangle|q_A\rangle|p_A\rangle$ and the channel outputs superpositions of $|\lambda_B\rangle|q_B\rangle|p_B\rangle$ for
Bob and $|\lambda_E\rangle|q_E\rangle|p_E\rangle$ for Eve. The intermediate state $|\alpha\rangle$ belongs to $\operatorname{Hom}(\mathcal{P}_{\lambda_A}, \mathcal{P}_{\lambda_B} \otimes \mathcal{P}_{\lambda_E})^{\mathcal{S}_n}$.

Using Eqns. (6.32) and (6.35), we can replace the CG transform in Fig. 6-1 with a teleportation-like
circuit. Instead of interpreting $\alpha$ as a member of $\operatorname{Hom}(\mathcal{P}_{\lambda_A}, \mathcal{P}_{\lambda_B} \otimes \mathcal{P}_{\lambda_E})^{\mathcal{S}_n}$, we say that
$|\alpha\rangle \in (\mathcal{P}_{\lambda_A} \otimes \mathcal{P}_{\lambda_B} \otimes \mathcal{P}_{\lambda_E})^{\mathcal{S}_n}$. This has the advantage of making its normalization more straightforward and
of enhancing the symmetry between $A$, $B$ and $E$. The $U_{\text{CG}}$ is then replaced with a projection
onto $|\Phi_{\lambda_A}\rangle$. Since this only succeeds with probability $1/D_A$, the resulting state needs to be normalized
by multiplying by $\sqrt{D_A}$. The resulting circuit is given in Fig. 6-2.



6.4.3 Jointly typical projectors in the Schur basis 

The channel decomposition in the last section is still extremely general. In particular, the structure
of the map is given by the $\lambda_A$, $\lambda_B$ and $\lambda_E$ which appear in Eq. (6.38), but generically all of the
coefficients will be nonzero. However, for large values of $n$, almost all of the weight will be contained
in a small set of typical triples $(\lambda_A, \lambda_B, \lambda_E)$. These triples are the quantum analogue of joint types
from classical information theory.

In this section we show the existence of typical sets of $(\lambda_A, \lambda_B, \lambda_E)$ onto which a channel's input
and output can be projected with little disturbance. In fact, we will define three versions of the
typical set $T_{\mathcal{N}}$ and show that they are in a certain sense asymptotically equivalent. For each version,
let $\rho^A$ be an arbitrary channel input, and $|\psi\rangle^{ABE} = (I^A \otimes U_{\mathcal{N}}^{A' \to BE})|\Phi_\rho\rangle^{AA'}$ the purified channel
output (following the CP formalism). Now define $R(\mathcal{N})$ to be the set of $\psi^{ABE}$ that can be generated in
this manner.

• Define $T_{\mathcal{N}}^* := \{(r_A, r_B, r_E) : \exists\, \psi^{ABE} \in R(\mathcal{N}) \text{ s.t. } r_A = \operatorname{spec}(\psi^A),\ r_B = \operatorname{spec}(\psi^B),\ r_E =
\operatorname{spec}(\psi^E)\}$. This set is simply the set of triples of spectra that can arise from one use of
the channel. It has the advantage of being easy to compute and to optimize over, but it doesn't
give us direct information about which values of $(\lambda_A, \lambda_B, \lambda_E)$ we need to consider.

Figure 6-2: The quantum channel $U_{\mathcal{N}}^{\otimes n}$ is decomposed in the Schur basis with teleportation replacing
the $\mathcal{S}_n$ CG transform. Here the intermediate state $|\alpha\rangle$ belongs to $(\mathcal{P}_{\lambda_A} \otimes \mathcal{P}_{\lambda_B} \otimes \mathcal{P}_{\lambda_E})^{\mathcal{S}_n}$, the box
labeled $\sqrt{D_A}\langle\Phi_{\lambda_A}|$ represents projecting onto the maximally entangled state $|\Phi_{\lambda_A}\rangle$, and normalization
requires multiplying the residual state by $\sqrt{D_A}$, where $D_A := \dim \mathcal{P}_{\lambda_A}$.



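• Define $T_{\mathcal{N}}^n(\epsilon) := \{(\lambda_A, \lambda_B, \lambda_E) : \exists\, \psi^{ABE} \in R(\mathcal{N}) \text{ s.t. } \operatorname{Tr}(\Pi_{\lambda_A}^A \otimes \Pi_{\lambda_B}^B \otimes \Pi_{\lambda_E}^E)\psi^{\otimes n} \ge \epsilon\}$.

This set tells us which $(\lambda_A, \lambda_B, \lambda_E)$ we need to consider when working with purified outputs
of $U_{\mathcal{N}}^{\otimes n}$. To see this, note that if $\psi \in R(\mathcal{N})$, then projecting onto $T_{\mathcal{N}}^n(\epsilon)$ will succeed with
probability $\ge 1 - \epsilon(n+1)^{3d}$ since there are $\le (n+1)^{3d}$ possible triples $(\lambda_A, \lambda_B, \lambda_E)$.

• Define $\tilde{T}_{\mathcal{N}}^n(\epsilon)$ to be the set of $(\lambda_A, \lambda_B, \lambda_E)$ s.t. there exists a subnormalized density matrix $\omega_A$
on $\mathcal{Q}_{\lambda_A}^{d_A}$ (i.e. $\operatorname{Tr} \omega_A \le 1$) s.t.

$$\operatorname{Tr}\big(|\lambda_B\rangle\langle\lambda_B| \otimes |\lambda_E\rangle\langle\lambda_E| \otimes I_{\mathcal{Q}_{\lambda_B}} \otimes I_{\mathcal{Q}_{\lambda_E}} \otimes I_\alpha\big) V_{\mathcal{N}}^n\big(|\lambda_A\rangle\langle\lambda_A| \otimes \omega_A\big) \ge \epsilon. \qquad (6.40)$$

Since that completely determines the map from $\lambda_A \mapsto (\lambda_B, \lambda_E)$, we don't need to consider
different values of the $|p_A\rangle$ register.

This set is useful when considering channel outputs in the CQ formalism. It says that if the
input is encoded in $\mathcal{Q}_{\lambda_A}^{d_A} \otimes \mathcal{P}_{\lambda_A}$ then only certain output states need be considered.

All of these sets could also be generalized to include possible $\mathcal{Q}_\lambda^d$ states as well. However, we focus
attention on the $(\lambda_A, \lambda_B, \lambda_E)$ since those determine the dimensions of $\mathcal{P}_\lambda$ and hence the possible
communication rates.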

We claim that the three typical sets described above are close to one another. In other words,
for any element in one typical set, the other sets have nearby elements, although we may have to
decrease $\epsilon$. Here "nearby" means that the distance goes to zero for any fixed or slowly-decreasing
value of $\epsilon$ as $n \to \infty$.

In the following proofs we will frequently omit mentioning $U_{\text{Sch}}$, implicitly identifying $\rho^{\otimes n}$ with
$U_{\text{Sch}} \rho^{\otimes n} U_{\text{Sch}}^\dagger$ and $U_{\mathcal{N}}^{\otimes n}$ with $U_{\text{Sch}} U_{\mathcal{N}}^{\otimes n} U_{\text{Sch}}^\dagger$.

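• $T_{\mathcal{N}}^* \Rightarrow T_{\mathcal{N}}^n(\epsilon)$ (i.e. for any triple in $T_{\mathcal{N}}^*$ there is a nearby triple in $T_{\mathcal{N}}^n(\epsilon)$)

Proof. Suppose $(r_A, r_B, r_E) \in T_{\mathcal{N}}^*$ and let $\psi^{ABE}$ be the corresponding state in $R(\mathcal{N})$ whose reduced
states have spectra $r_A$, $r_B$ and $r_E$. Define the probability distribution $\Pr(\lambda_A, \lambda_B, \lambda_E) :=
\operatorname{Tr}(\Pi_{\lambda_A}^A \otimes \Pi_{\lambda_B}^B \otimes \Pi_{\lambda_E}^E)\psi^{\otimes n}$. Then by Eq. (6.23), $\Pr(\tfrac{1}{2}\|r_A - \bar\lambda_A\|_1 > \delta) \le (n+d)^{d(d+1)/2} \exp(-n\delta^2)$
for any $\delta > 0$. Repeating this for $\lambda_B$ and $\lambda_E$, we find that

$$\Pr\Big[\big(\tfrac{1}{2}\|r_A - \bar\lambda_A\|_1 > \delta\big) \vee \big(\tfrac{1}{2}\|r_B - \bar\lambda_B\|_1 > \delta\big) \vee \big(\tfrac{1}{2}\|r_E - \bar\lambda_E\|_1 > \delta\big)\Big] \le 3(n+d)^{d(d+1)/2} \exp(-n\delta^2).$$

Since the number of triples $(\lambda_A, \lambda_B, \lambda_E)$ is $\le (n+1)^{3d}$, this means there exists a triple
$(\lambda_A, \lambda_B, \lambda_E)$ with $\Pr(\lambda_A, \lambda_B, \lambda_E) \ge (n+1)^{-3d}\big(1 - 3(n+d)^{d(d+1)/2} \exp(-n\delta^2)\big) =: \epsilon$ (and so
$(\lambda_A, \lambda_B, \lambda_E) \in T_{\mathcal{N}}^n(\epsilon)$), satisfying $\tfrac{1}{2}\|r_A - \bar\lambda_A\|_1 \le \delta$, $\tfrac{1}{2}\|r_B - \bar\lambda_B\|_1 \le \delta$ and $\tfrac{1}{2}\|r_E - \bar\lambda_E\|_1 \le \delta$.
One natural choice is to take $\delta = (\log n)/\sqrt{n}$ and $\epsilon = 1/\operatorname{poly}(n)$. □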



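Proof. Suppose $(\lambda_A, \lambda_B, \lambda_E) \in T_{\mathcal{N}}^n(\epsilon)$, meaning that there exists $\psi^{ABE} \in R(\mathcal{N})$ s.t.
$\operatorname{Tr}(\Pi_{\lambda_A}^A \otimes \Pi_{\lambda_B}^B \otimes \Pi_{\lambda_E}^E)\psi^{\otimes n} \ge \epsilon$. Thus if we set $\rho^A = \operatorname{Tr}_{BE} \psi$ then

$$\epsilon \le \operatorname{Tr}(\Pi_{\lambda_A}^A \otimes \Pi_{\lambda_B}^B \otimes \Pi_{\lambda_E}^E)\big[(I^A \otimes U_{\mathcal{N}}^{A' \to BE})\Phi_\rho^{AA'}\big]^{\otimes n} \qquad (6.41)$$
$$= \operatorname{Tr}(\Pi_{\lambda_B} \otimes \Pi_{\lambda_E})\, U_{\mathcal{N}}^{\otimes n}\big(\Pi_{\lambda_A} \rho^{\otimes n} \Pi_{\lambda_A}\big) \qquad (6.42)$$
$$= \operatorname{Tr}(\Pi_{\lambda_B} \otimes \Pi_{\lambda_E})\, U_{\mathcal{N}}^{\otimes n}\big(|\lambda_A\rangle\langle\lambda_A| \otimes \mathbf{q}_{\lambda_A}(\rho) \otimes I_{\mathcal{P}_{\lambda_A}}\big) \qquad (6.43)$$
$$= \operatorname{Tr}\big(|\lambda_B\rangle\langle\lambda_B| \otimes I_{\mathcal{Q}_{\lambda_B}} \otimes |\lambda_E\rangle\langle\lambda_E| \otimes I_{\mathcal{Q}_{\lambda_E}}\big)\, V_{\mathcal{N}}^n\big(|\lambda_A\rangle\langle\lambda_A| \otimes \mathbf{q}_{\lambda_A}(\rho) \cdot \dim \mathcal{P}_{\lambda_A}\big) \qquad (6.44)$$
$$= \operatorname{Tr}\big(|\lambda_B\rangle\langle\lambda_B| \otimes I_{\mathcal{Q}_{\lambda_B}} \otimes |\lambda_E\rangle\langle\lambda_E| \otimes I_{\mathcal{Q}_{\lambda_E}}\big)\, V_{\mathcal{N}}^n\big(|\lambda_A\rangle\langle\lambda_A| \otimes \omega_A\big) \qquad (6.45)$$

In the last step we have defined the (subnormalized) density matrix $\omega_A := \mathbf{q}_{\lambda_A}(\rho) \cdot \dim \mathcal{P}_{\lambda_A}$.
(It is subnormalized because $\operatorname{Tr} \omega_A = \operatorname{Tr} \Pi_{\lambda_A} \rho^{\otimes n} \le 1$.) Thus $(\lambda_A, \lambda_B, \lambda_E) \in \tilde{T}_{\mathcal{N}}^n(\epsilon)$. □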



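• $\tilde{T}_{\mathcal{N}}^n(\epsilon) \subseteq T_{\mathcal{N}}^n(\epsilon_1)$, $\epsilon_1 = \epsilon (n+d)^{-d^2}$

Proof. If $(\lambda_A, \lambda_B, \lambda_E) \in \tilde{T}_{\mathcal{N}}^n(\epsilon)$ then there exists a density matrix $\omega_A$ on $\mathcal{Q}_{\lambda_A}$ s.t.

$$\operatorname{Tr}(\Pi_{\lambda_B} \otimes \Pi_{\lambda_E})\, \mathcal{N}^{\otimes n}\Big(|\lambda_A\rangle\langle\lambda_A| \otimes \omega_A \otimes \frac{I_{\mathcal{P}_{\lambda_A}}}{\dim \mathcal{P}_{\lambda_A}}\Big) \ge \epsilon. \qquad (6.46)$$

In fact, this would remain true if we replaced $I_{\mathcal{P}_{\lambda_A}} / \dim \mathcal{P}_{\lambda_A}$ with any normalized state.

Define $\rho_0 = \sum_{i=1}^{d_A} \bar\lambda_{A,i} |i\rangle\langle i|$ and let $dU$ denote a Haar measure on $\mathcal{U}_{d_A}$. By Schur's Lemma,
averaging $\mathbf{q}_{\lambda_A}^{d_A}(U \rho_0 U^\dagger)$ over $dU$ gives a matrix proportional to the identity. To obtain the
proportionality factor, we use Eq. (6.20) to bound

$$\beta := \operatorname{Tr} \Pi_{\lambda_A} \rho_0^{\otimes n} \Pi_{\lambda_A} = \operatorname{Tr} \mathbf{q}_{\lambda_A}(\rho_0) \cdot \dim \mathcal{P}_{\lambda_A} \ge (n+d)^{-d(d+1)/2}. \qquad (6.47)$$

Upon averaging, we then find that

$$\beta \frac{I_{\mathcal{Q}_{\lambda_A}^{d_A}}}{\dim \mathcal{Q}_{\lambda_A}^{d_A}} = \int dU\, \mathbf{q}_{\lambda_A}^{d_A}(U \rho_0 U^\dagger) \cdot \dim \mathcal{P}_{\lambda_A}. \qquad (6.48)$$

Now $\omega_A \le I_{\mathcal{Q}_{\lambda_A}^{d_A}}$, so $\mathcal{N}^{\otimes n}\big(|\lambda_A\rangle\langle\lambda_A| \otimes \omega_A \otimes \frac{I_{\mathcal{P}_{\lambda_A}}}{\dim \mathcal{P}_{\lambda_A}}\big) \le \mathcal{N}^{\otimes n}\big(|\lambda_A\rangle\langle\lambda_A| \otimes I_{\mathcal{Q}_{\lambda_A}^{d_A}} \otimes \frac{I_{\mathcal{P}_{\lambda_A}}}{\dim \mathcal{P}_{\lambda_A}}\big)$ and

$$\epsilon \le \operatorname{Tr} \mathcal{N}^{\otimes n}\Big(|\lambda_A\rangle\langle\lambda_A| \otimes \omega_A \otimes \frac{I_{\mathcal{P}_{\lambda_A}}}{\dim \mathcal{P}_{\lambda_A}}\Big)(\Pi_{\lambda_B} \otimes \Pi_{\lambda_E}) \qquad (6.49)$$
$$\le \operatorname{Tr} \mathcal{N}^{\otimes n}\Big(|\lambda_A\rangle\langle\lambda_A| \otimes I_{\mathcal{Q}_{\lambda_A}^{d_A}} \otimes \frac{I_{\mathcal{P}_{\lambda_A}}}{\dim \mathcal{P}_{\lambda_A}}\Big)(\Pi_{\lambda_B} \otimes \Pi_{\lambda_E}) \qquad (6.50)$$
$$= \frac{\dim \mathcal{Q}_{\lambda_A}^{d_A}}{\beta} \int dU\, \operatorname{Tr}\Big(\mathcal{N}^{\otimes n}\big(\Pi_{\lambda_A} (U \rho_0 U^\dagger)^{\otimes n} \Pi_{\lambda_A}\big)\Big)(\Pi_{\lambda_B} \otimes \Pi_{\lambda_E}) \qquad (6.51)$$
$$\le \max_U \frac{\dim \mathcal{Q}_{\lambda_A}^{d_A}}{\beta} \operatorname{Tr}\Big(\mathcal{N}^{\otimes n}\big(\Pi_{\lambda_A} (U \rho_0 U^\dagger)^{\otimes n} \Pi_{\lambda_A}\big)\Big)(\Pi_{\lambda_B} \otimes \Pi_{\lambda_E}). \qquad (6.52)$$

In the last step we have used the fact that $\int dU = 1$ so that $\int dU f(U) \le \max_U f(U)$ for
any function on $\mathcal{U}_{d_A}$. Therefore $\exists\, \rho = U \rho_0 U^\dagger$ with $\psi^{ABE} = (I^A \otimes U_{\mathcal{N}}^{A' \to BE})|\Phi_\rho\rangle^{AA'}$ such that
$\operatorname{Tr}(\Pi_{\lambda_A}^A \otimes \Pi_{\lambda_B}^B \otimes \Pi_{\lambda_E}^E)\psi^{\otimes n} \ge \epsilon\beta / \dim \mathcal{Q}_{\lambda_A}^{d_A} \ge \epsilon(n+d)^{-d^2} =: \epsilon'$.

This means that $(\lambda_A, \lambda_B, \lambda_E) \in T_{\mathcal{N}}^n(\epsilon')$. □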
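Proof. Again, we are given $\psi^{ABE} \in R(\mathcal{N})$ and a triple $(\lambda_A, \lambda_B, \lambda_E)$ s.t.
$\operatorname{Tr}(\Pi_{\lambda_A}^A \otimes \Pi_{\lambda_B}^B \otimes \Pi_{\lambda_E}^E)\psi^{\otimes n} \ge \epsilon$. And again we define
$\Pr(\lambda_A, \lambda_B, \lambda_E) := \operatorname{Tr}(\Pi_{\lambda_A}^A \otimes \Pi_{\lambda_B}^B \otimes \Pi_{\lambda_E}^E)\psi^{\otimes n}$. Now let
$\delta := \max_{X \in \{A, B, E\}} \tfrac{1}{2}\|\bar\lambda_X - \operatorname{spec} \psi^X\|_1$ and use Eq. (6.23) to bound

$$\epsilon \le \Pr(\lambda_A, \lambda_B, \lambda_E) \le (n+d)^{d(d+1)/2} \exp(-n\delta^2). \qquad (6.53)$$

Thus $(r_A, r_B, r_E) = (\operatorname{spec} \psi^A, \operatorname{spec} \psi^B, \operatorname{spec} \psi^E) \in T_{\mathcal{N}}^*$ and satisfies $\tfrac{1}{2}\|r_A - \bar\lambda_A\|_1 \le \delta$,
$\tfrac{1}{2}\|r_B - \bar\lambda_B\|_1 \le \delta$ and $\tfrac{1}{2}\|r_E - \bar\lambda_E\|_1 \le \delta$ for $\delta$ s.t.

$$\delta^2 \le \frac{\frac{d(d+1)}{2} \log(n+d) + \log 1/\epsilon}{n}.$$

□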



The preceding set of proofs establishes more than will usually be necessary. The main conclusion
to draw from this section is that one can project onto triples $(\lambda_A, \lambda_B, \lambda_E)$ that are all within $\delta$ of
triples in $T_{\mathcal{N}}^*$ while disturbing the state by no more than $\operatorname{poly}(n) \exp(-n\delta^2)$.
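To get a feeling for the scales involved, the sketch below evaluates the union bound $3(n+d)^{d(d+1)/2}\exp(-n\delta^2)$ from the first proof and the constraint $\delta^2 \le \big(\tfrac{d(d+1)}{2}\log(n+d) + \log 1/\epsilon\big)/n$ from the last one. It works with natural logarithms to avoid underflow; the function names are illustrative and the particular values of $n$, $d$ and $\epsilon$ are arbitrary choices.

```python
import math


def log_disturbance_bound(n, d, delta):
    """Natural log of 3 (n+d)^(d(d+1)/2) exp(-n delta^2)."""
    return math.log(3) + (d * (d + 1) / 2) * math.log(n + d) - n * delta ** 2


def delta_for_epsilon(n, d, eps):
    """Largest delta with delta^2 <= ((d(d+1)/2) log(n+d) + log(1/eps)) / n."""
    exponent = (d * (d + 1) / 2) * math.log(n + d) + math.log(1 / eps)
    return math.sqrt(exponent / n)


def natural_delta(n):
    """The choice delta = (log n) / sqrt(n) suggested in the first proof."""
    return math.log(n) / math.sqrt(n)
```

With $\delta = (\log n)/\sqrt{n}$ the exponential factor equals $\exp(-(\log n)^2)$, which eventually beats any fixed polynomial prefactor: for $d = 2$ the bound is vacuous at $n = 10$ but negligible at $n = 10^6$.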



6.4.4 Conclusions 

The results of this chapter should be thought of as laying the groundwork for a quantum analogue
of joint types. Although many coding theorems have been proved for noisy states and channels
without using this formalism, hopefully joint quantum types will give proofs that are simpler, more
powerful, or not feasible by other means. One problem for which the technique seems promising is
the Quantum Reverse Shannon Theorem [BDH+05], in which it gives a relatively simple method for
efficiently simulating a noisy quantum channel on arbitrary sources. It remains to be seen where else
the techniques will be useful.

Chapter 7

Efficient circuits for the Schur
transform

The previous chapter showed how the Schur transform is a vital ingredient in a wide variety of coding
theorems of quantum information theory. However, for these protocols to be of practical value, an
efficient (i.e. polynomial time) implementation of the Schur transform will be necessary.

The goal of performing classical coding tasks in polynomial or even linear time has long been
studied, but quantum information theory has typically ignored questions of efficiency. For example,
random coding results (such as [Hol98, SW97, BHL+05, DW04]) require an exponential number
of bits to describe, and like classical random coding techniques, do not yield efficient algorithms.
There are a few important exceptions. Some quantum coding tasks, such as Schumacher
compression [Sch95, JS94], are essentially equivalent to classical circuits, and as such can be performed
efficiently on a quantum computer by carefully modifying an efficient classical algorithm to run
reversibly and deal properly with ancilla systems [CD96]. Another example, which illustrates some of
the challenges involved, is [KM01]'s efficient implementation of entanglement concentration [BBPS96].
Quantum key distribution [BB84] not only runs efficiently, but can be implemented with entirely, or
almost entirely, single-qubit operations and classical computation. Fault tolerance [Sho96] usually
seeks to perform error correction with as few gates as possible, although using teleportation-based
techniques [GC99, Kni04] computational efficiency may not be quite as critical to the threshold rate.
Finally, some randomized quantum code constructions have been given efficient constructions using
classical derandomization techniques in [AS04]. Our efficient construction of the Schur transform
adds to this list a powerful new tool for finding algorithms that implement quantum communication
tasks.

From a broader perspective, the transforms involved in quantum information protocols are important
because they show a connection between a quantum problem with structure and transforms of
quantum information which exploit this structure. The theory of quantum algorithms has languished
relative to the tremendous progress in quantum information theory due in large part to a lack of
exactly this type of construction: transforms with interpretations. When we say a quantum algorithm
is simply a change of basis, we are doing a disservice to the fact that efficient quantum algorithms
must have efficient quantum circuits. In the nonabelian hidden subgroup problem, for example, it is
known that there is a transform which solves the problem, but there is no known efficient quantum
circuit for this transform [EHK97]. There is great impetus, therefore, to construct efficient quantum
circuits for transforms of quantum information where the transform exploits some structure of the
problem.

We begin in Section 7.1 by describing explicit bases (known as subgroup-adapted bases) for the
irreps of the unitary and symmetric groups. In Section 7.2, we show how these bases allow the Schur
transform to be decomposed into a series of CG transforms and in Section 7.3 we give an efficient
construction of a CG transform. Together these three sections comprise an efficient (i.e. running
time polynomial in $n$, $d$ and $\log 1/\epsilon$ for error $\epsilon$) algorithm for the Schur transform.

7.1 Subgroup-adapted bases for $\mathcal{Q}_\lambda^d$ and $\mathcal{P}_\lambda$

To construct a quantum circuit for the Schur transform, we will need to explicitly specify the Schur
basis. Since we want the Schur basis to be of the form $|\lambda, q, p\rangle$, our task reduces to specifying
orthonormal bases for $\mathcal{Q}_\lambda^d$ and $\mathcal{P}_\lambda$. We will call these bases $Q_\lambda^d$ and $P_\lambda$, respectively.

We will choose $Q_\lambda^d$ and $P_\lambda$ to both be a type of basis known as a subgroup-adapted basis. In
Section 7.1.1 we describe the general theory of subgroup-adapted bases, and in Section 7.1.2, we will
describe subgroup-adapted bases for $\mathcal{Q}_\lambda^d$ and $\mathcal{P}_\lambda$. As we will later see, these bases have a recursive
structure that is naturally related to the structure of the algorithms that work with them. Here we
will show how the bases can be stored on a quantum computer with a small amount of padding, and
later in this chapter we will show how the subgroup-adapted bases described here enable efficient
implementations of Clebsch-Gordan and Schur duality transforms.

7.1.1 Subgroup Adapted Bases 

First we review the basic idea of a subgroup adapted basis. We assume that all groups we talk about
are finite or compact Lie groups. Suppose $(\mathbf{r}, V)$ is an irrep of a group $G$ and $H$ is a proper subgroup
of $G$. We will construct a basis for $V$ via the representations of $H$.

Begin by restricting the input of $\mathbf{r}$ to $H$ to obtain a representation of $H$, which we call $(\mathbf{r}|_H, V{\downarrow}_H)$.
Unlike $V$, the $H$-representation $V{\downarrow}_H$ may be reducible. In fact, if we let $(\mathbf{r}'_\alpha, V'_\alpha)$ denote the irreps of
$H$, then $V{\downarrow}_H$ will decompose under the action of $H$ as

$$V{\downarrow}_H \cong \bigoplus_{\alpha \in \hat{H}} V'_\alpha \otimes \mathbb{C}^{n_\alpha} \qquad (7.1)$$

or equivalently, $\mathbf{r}|_H$ decomposes as

$$\mathbf{r}(h) = \mathbf{r}|_H(h) \cong \bigoplus_{\alpha \in \hat{H}} \mathbf{r}'_\alpha(h) \otimes I_{n_\alpha} \qquad (7.2)$$

where $\hat{H}$ runs over a complete set of inequivalent irreps of $H$ and $n_\alpha$ is the branching multiplicity of
the irrep labeled by $\alpha$. Note that since $\mathbf{r}$ is a unitary representation, the subspaces corresponding to
different irreps of $H$ are orthogonal. Thus, the problem of finding an orthonormal basis for $V$ now
reduces to the problem of (1) finding an orthonormal basis for each irrep of $H$, $V'_\alpha$, and (2) finding
orthonormal bases for the multiplicity spaces $\mathbb{C}^{n_\alpha}$. The case when all the $n_\alpha$ are either 0 or 1 is known
as multiplicity-free branching. When this occurs, we only need to determine which irreps occur in the
decomposition of $V$, and find bases for them.
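As a small illustration of branching multiplicities, the sketch below restricts the three irreps of $\mathcal{S}_3$ to the subgroup $\mathcal{S}_2 = \{e, (12)\}$ and computes each $n_\alpha$ from characters. It assumes the standard orthogonality formula $n_\alpha = \frac{1}{|H|}\sum_{h \in H} \chi_V(h)\chi_\alpha(h)$ (no complex conjugate is needed here because all characters involved are real); the dictionary and function names are illustrative.

```python
from fractions import Fraction

# Characters of the three S_3 irreps evaluated on the two elements of
# the subgroup S_2 = {e, (12)}.
CHI_G_ON_H = {"trivial": (1, 1), "sign": (1, -1), "standard": (2, 0)}
# Characters of the two irreps of S_2 on the same two elements.
CHI_H = {"trivial": (1, 1), "sign": (1, -1)}


def branching_multiplicities(g_irrep):
    """Return {alpha: n_alpha} for the restriction of an S_3 irrep to S_2."""
    chi_v = CHI_G_ON_H[g_irrep]
    order_h = len(chi_v)  # |S_2| = 2
    result = {}
    for alpha, chi_a in CHI_H.items():
        n_alpha = Fraction(sum(x * y for x, y in zip(chi_v, chi_a)), order_h)
        assert n_alpha.denominator == 1  # multiplicities are integers
        result[alpha] = int(n_alpha)
    return result
```

Every multiplicity returned is 0 or 1, so this restriction is multiplicity-free; in particular the two-dimensional irrep splits into one copy each of the trivial and sign irreps of $\mathcal{S}_2$.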

Now consider a group $G$ along with a tower of subgroups $G = G_1 \supset G_2 \supset \cdots \supset G_{k-1} \supset G_k = \{e\}$
where $\{e\}$ is the trivial subgroup consisting of only the identity element. For each $G_i$, denote its
irreps by $V^i_\alpha$, for $\alpha \in \hat{G}_i$. Any irrep $V^1_{\alpha_1}$ of $G = G_1$ decomposes under restriction to $G_2$ into $G_2$-irreps:
say that $V^2_{\alpha_2}$ appears $n_{\alpha_1, \alpha_2}$ times. We can then look at these irreps of $G_2$, consider their
restriction to $G_3$ and decompose them into different irreps of $G_3$. Carrying on in such a manner
down this tower of subgroups will yield a labeling for subspaces corresponding to each of these
restrictions. Moreover, if we choose orthonormal bases for the multiplicity spaces, this will induce
an orthonormal basis for the original irrep of $G$. This basis is known as a subgroup-adapted basis and basis vectors have
the form $|\alpha_2, m_2, \alpha_3, m_3, \ldots, \alpha_k, m_k\rangle$, where $|m_i\rangle$ is a basis vector for the ($n_{\alpha_{i-1}, \alpha_i}$-dimensional)
multiplicity space of $V^i_{\alpha_i}$ in $V^{i-1}_{\alpha_{i-1}}$.

If the branching for each $G_{i+1} \subset G_i$ is multiplicity-free, then we say that the tower of subgroups is
canonical. In this case, the subgroup adapted basis takes the particularly simple form of $|\alpha_2, \ldots, \alpha_k\rangle$,
where each $\alpha_i \in \hat{G}_i$ and $\alpha_{i+1}$ appears in the decomposition of $V^i_{\alpha_i}{\downarrow}_{G_{i+1}}$. Often we include the original
irrep label $\alpha = \alpha_1$ as well: $|\alpha_1, \alpha_2, \ldots, \alpha_k\rangle$. This means that there exists a basis whose vectors
are completely determined (up to an arbitrary choice of phase) by which irreps of $G_1, \ldots, G_k$ they
transform according to. Notice that a basis for the irrep $V_\alpha$ does not consist of all possible irrep
labels $\alpha_i$, but instead only those which can appear under the restriction which defines the basis.

The simple recursive structure of subgroup adapted bases makes them well-suited to performing
explicit computations. Thus, for example, subgroup adapted bases play a major role in efficient
quantum circuits for the Fourier transform over many nonabelian groups [MRR04].



7.1.2 Explicit orthonormal bases for $\mathcal{Q}_\lambda^d$ and $\mathcal{P}_\lambda$

In this section we describe canonical towers of subgroups for $\mathcal{U}_d$ and $\mathcal{S}_n$, which give rise to
subgroup-adapted bases for the irreps $\mathcal{Q}_\lambda^d$ and $\mathcal{P}_\lambda$. These bases go by many names: for $\mathcal{U}_d$ (and other Lie
groups) the basis is called the Gel'fand-Zetlin basis (following [GZ50]) and we denote it by $Q_\lambda^d$, while
for $\mathcal{S}_n$ it is called the Young-Yamanouchi basis, or sometimes Young's orthogonal basis (see [JK81] for
a good review of its properties) and is denoted $P_\lambda$. The constructions and corresponding branching
rules are quite simple, but for proofs we again refer the reader to [GW98].

The Gel'fand-Zetlin basis for $\mathcal{Q}_\lambda^d$: For $\mathcal{U}_d$, it turns out that the chain of subgroups $\{1\} = \mathcal{U}_0 \subset
\mathcal{U}_1 \subset \ldots \subset \mathcal{U}_{d-1} \subset \mathcal{U}_d$ is a canonical tower. For $c < d$, the subgroup $\mathcal{U}_c$ is embedded in $\mathcal{U}_d$ by
$\mathcal{U}_c := \{U \in \mathcal{U}_d : U|i\rangle = |i\rangle \text{ for } i = c+1, \ldots, d\}$. In other words, it corresponds to matrices of the
form

$$U \oplus I_{d-c} = \begin{pmatrix} U & 0 \\ 0 & I_{d-c} \end{pmatrix} \qquad (7.3)$$

where $U$ is a $c \times c$ unitary matrix.

Since the branching from $\mathcal{U}_d$ to $\mathcal{U}_{d-1}$ is multiplicity-free, we obtain a subgroup-adapted basis $Q_\lambda^d$,
which is known as the Gel'fand-Zetlin (GZ) basis. Our only free choice in a GZ basis is the initial
choice of basis $|1\rangle, \ldots, |d\rangle$ for $\mathbb{C}^d$ which determines the canonical tower of subgroups $\mathcal{U}_1 \subset \ldots \subset \mathcal{U}_d$.
Once we have chosen this basis, specifying $Q_\lambda^d$ reduces to knowing which irreps $\mathcal{Q}_\mu^{d-1}$ appear in the
decomposition of $\mathcal{Q}_\lambda^d{\downarrow}_{\mathcal{U}_{d-1}}$. Recall that the irreps of $\mathcal{U}_d$ are labeled by elements of $\mathcal{I}_{d,n}$ with $n$ arbitrary.
This set can be denoted by $\mathbb{Z}^d_{++} := \cup_n \mathcal{I}_{d,n} = \{\lambda \in \mathbb{Z}^d : \lambda_1 \ge \ldots \ge \lambda_d \ge 0\}$. For $\mu \in \mathbb{Z}^{d-1}_{++}$, $\lambda \in \mathbb{Z}^d_{++}$,
we say that $\mu$ interlaces $\lambda$ and write $\mu \precsim \lambda$ whenever $\lambda_1 \ge \mu_1 \ge \lambda_2 \ge \ldots \ge \lambda_{d-1} \ge \mu_{d-1} \ge \lambda_d$. In terms
of Young diagrams, this means that $\mu$ is a valid partition (i.e. a nonnegative, nonincreasing sequence)
obtained from removing zero or one boxes from each column of $\lambda$. For example, if $\lambda = (4, 3, 1, 1)$
(as in Eq. (5.22)), then $\mu \precsim \lambda$ can be obtained by removing any subset of the marked boxes below,
although if the box marked * on the second line is removed, then the other marked box on the line
must also be removed.

    [ ][ ][ ][X]
    [ ][*][X]
    [ ]
    [X]                                                     (7.4)

Thus a basis vector in $Q_\lambda^d$ corresponds to a sequence of partitions $q = (q_d, \ldots, q_1)$ such that $q_d = \lambda$,
$q_1 \precsim q_2 \precsim \ldots \precsim q_d$ and $q_j \in \mathbb{Z}^j_{++}$ for $j = 1, \ldots, d$. Again using $\lambda = (4, 3, 1, 1)$ as an example, and
choosing $d = 5$ (any $d \ge 4$ is possible), we might have the sequence

$$\underbrace{(4,3,1,1)}_{q_5} \succsim \underbrace{(3,3,1)}_{q_4} \succsim \underbrace{(3,3,1)}_{q_3} \succsim \underbrace{(3,1)}_{q_2} \succsim \underbrace{(2)}_{q_1} \qquad (7.5)$$

Observe that it is possible in some steps not to remove any boxes, as long as $q_j$ has no more than $j$
rows.

In order to work with the Gel'fand-Zetlin basis vectors on a quantum computer, we will need an
efficient way to write them down. Typically, we think of $d$ as constant and express our resource use
in terms of $n$. Then an element of $\mathcal{I}_{d,n}$ can be expressed with $d \log(n+1)$ bits, since it consists of
$d$ integers between 0 and $n$. (This is a crude upper bound on $|\mathcal{I}_{d,n}| \le \binom{n+d-1}{d-1}$, but for constant
$d$ it is good enough for our purposes.) A Gel'fand-Zetlin basis vector then requires no more than
$d^2 \log(n+1)$ bits, since it can be expressed as $d$ partitions of integers no greater than $n$ into $\le d$ parts.
(Here we assume that all partitions have arisen from a decomposition of $(\mathbb{C}^d)^{\otimes n}$, so that no Young
diagram has more than $n$ boxes.) Unless otherwise specified, our algorithms will use this encoding of
the GZ basis vectors.
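The interlacing rule translates directly into a recursive enumeration of GZ basis vectors. The following is a minimal sketch, assuming partitions are stored as tuples padded with zeros to length $j$ at level $j$, and using the standard Weyl dimension formula as an independent check on the count (the thesis cites its own expression, Eq. (6.12), for $\dim \mathcal{Q}_\lambda^d$). The function names are illustrative.

```python
from fractions import Fraction
from itertools import product


def interlacing(lam):
    """Yield every mu of length len(lam)-1 with lam[i] >= mu[i] >= lam[i+1]."""
    ranges = [range(lam[i + 1], lam[i] + 1) for i in range(len(lam) - 1)]
    for mu in product(*ranges):
        yield tuple(mu)


def gz_patterns(lam):
    """Yield chains (q_d, ..., q_1) with q_d = lam and q_{j-1} interlacing q_j."""
    lam = tuple(lam)
    if len(lam) == 1:
        yield (lam,)
        return
    for mu in interlacing(lam):
        for rest in gz_patterns(mu):
            yield (lam,) + rest


def weyl_dimension(lam):
    """Dimension of the U_d irrep with highest weight lam (d = len(lam))."""
    d = len(lam)
    dim = Fraction(1)
    for i in range(d):
        for j in range(i + 1, d):
            dim *= Fraction(lam[i] - lam[j] + j - i, j - i)
    return int(dim)
```

For example, `gz_patterns((4, 3, 1, 1, 0))` enumerates the GZ basis for $\lambda = (4,3,1,1)$ with $d = 5$, and the number of chains it yields agrees with `weyl_dimension((4, 3, 1, 1, 0))`.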

It is also possible to express GZ basis vectors in a more visually appealing way by writing numbers
in the boxes of a Young diagram. If $q_1 \precsim \ldots \precsim q_d$ is a chain of partitions, then we write the number
$j$ in each box contained in $q_j$ but not $q_{j-1}$ (with $q_0 = (0)$). For example, the sequence in Eq. (7.5)
would be denoted

    1 1 2 5
    2 3 3
    3
    5                                                       (7.6)

Equivalently, any method of filling a Young diagram with numbers from $1, \ldots, d$ corresponds to a
valid chain of irreps as long as the numbers are nondecreasing from left to right and are strictly
increasing from top to bottom. This gives another way of encoding a GZ basis vector, this time using
$n \log d$ bits. (In fact, we have an exact formula for $\dim \mathcal{Q}_\lambda^d$ (Eq. (6.12)) and later in this section we
will give an algorithm for efficiently encoding a GZ basis vector in the optimal $\lceil \log \dim \mathcal{Q}_\lambda^d \rceil$ qubits.
However, this is not necessary for most applications.)
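The correspondence between chains of partitions and filled Young diagrams can be made concrete as follows. This is a minimal sketch in which a chain is given as a list $[q_1, \ldots, q_d]$ of tuples without trailing zeros and a filling is a list of rows; the function names are illustrative.

```python
def chain_to_tableau(chain):
    """Write j in every box of q_j that is not in q_{j-1}; chain = [q_1, ..., q_d]."""
    rows = [[None] * r for r in chain[-1]]
    prev = ()
    for j, q in enumerate(chain, start=1):
        for i, r in enumerate(q):
            start = prev[i] if i < len(prev) else 0
            for col in range(start, r):
                rows[i][col] = j
        prev = q
    return rows


def tableau_to_chain(rows, d):
    """Recover [q_1, ..., q_d]: q_j counts the entries <= j in each row."""
    chain = []
    for j in range(1, d + 1):
        counts = [sum(1 for x in row if x <= j) for row in rows]
        chain.append(tuple(c for c in counts if c > 0))
    return chain


def is_semistandard(rows):
    """Rows nondecreasing left to right, columns strictly increasing downward."""
    rows_ok = all(a <= b for row in rows for a, b in zip(row, row[1:]))
    cols_ok = all(upper[c] < lower[c]
                  for upper, lower in zip(rows, rows[1:])
                  for c in range(len(lower)))
    return rows_ok and cols_ok
```

Applied to the chain $(2) \precsim (3,1) \precsim (3,3,1) \precsim (3,3,1) \precsim (4,3,1,1)$, `chain_to_tableau` produces the rows `[1, 1, 2, 5]`, `[2, 3, 3]`, `[3]`, `[5]`, and `tableau_to_chain` inverts the map.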

Example: irreps of $\mathcal{U}_2$: To ground the above discussion in an example more familiar to physicists,
we show how the GZ basis for $\mathcal{U}_2$ irreps corresponds to states of definite angular momentum along
one axis. An irrep of $\mathcal{U}_2$ is labeled by two integers $(\lambda_1, \lambda_2)$ such that $\lambda_1 + \lambda_2 = n$ and $\lambda_1 \ge \lambda_2 \ge 0$. A
GZ basis vector for $\mathcal{Q}_\lambda^2$ has $\lambda_2 + m$ 1's in the first row, followed by $\lambda_1 - (\lambda_2 + m)$ 2's in the first row
and $\lambda_2$ 2's in the second row, where $m$ ranges from 0 to $\lambda_1 - \lambda_2$. This arrangement is necessary to
satisfy the constraint that numbers are strictly increasing from top to bottom and are nondecreasing
from left to right. Since the GZ basis vectors are completely specified by $m$, we can label the vector
$|(\lambda_1, \lambda_2); (\lambda_2 + m)\rangle \in Q_\lambda^2$ simply by $|m\rangle$. For example, $\lambda = (9, 4)$ and $m = 2$ would look like

    1 1 1 1 1 1 2 2 2
    2 2 2 2

Now observe that $\dim \mathcal{Q}_\lambda^2 = \lambda_1 - \lambda_2 + 1$, a fact which is consistent with having angular momentum
$J = (\lambda_1 - \lambda_2)/2$. We claim that $m$ corresponds to the $Z$ component of angular momentum (specifically,
the $Z$ component of angular momentum is $m - J = m - (\lambda_1 - \lambda_2)/2$). To see this, first note that $\mathcal{U}_1$
acts on a GZ basis vector $|m\rangle$ according to the representation $x \to x^{\lambda_2 + m}$, for $x \in \mathcal{U}_1$; equivalently
$\mathbf{q}_\lambda^2\big(\begin{smallmatrix} x & 0 \\ 0 & 1 \end{smallmatrix}\big)|m\rangle = x^{\lambda_2 + m}|m\rangle$. Since $\mathbf{q}_\lambda^2(y I_2)|m\rangle = y^n|m\rangle = y^{\lambda_1 + \lambda_2}|m\rangle$, we can find the action of
$e^{i\theta\sigma_z} = \big(\begin{smallmatrix} e^{i\theta} & 0 \\ 0 & e^{-i\theta} \end{smallmatrix}\big)$ on $|m\rangle$ by combining the above arguments to find that

$$\mathbf{q}_\lambda^2(e^{i\theta\sigma_z})|m\rangle = e^{2i\theta(\lambda_2 + m)} e^{-i\theta(\lambda_1 + \lambda_2)}|m\rangle = e^{2i\theta(m - J)}|m\rangle.$$

Thus we obtain the desired action of a $Z$
rotation on a particle with total angular momentum $J$ and $Z$-component of angular momentum $m - J$.

Example: The defining irrep of $\mathcal{U}_d$: The simplest nontrivial irrep of $\mathcal{U}_d$ is its action on $\mathbb{C}^d$. This
corresponds to the partition $(1)$, so we say that $(\mathbf{q}^d_{(1)}, \mathcal{Q}^d_{(1)})$ is the defining irrep of $\mathcal{U}_d$ with $\mathcal{Q}^d_{(1)} = \mathbb{C}^d$
and $\mathbf{q}^d_{(1)}(U) = U$. Let $|1\rangle, \ldots, |d\rangle$ be an orthonormal basis for $\mathbb{C}^d$ corresponding to the canonical
tower of subgroups $\mathcal{U}_1 \subset \cdots \subset \mathcal{U}_d$. It turns out that this is already a GZ basis. To see this, note
that $\mathcal{Q}^d_{(1)}{\downarrow}_{\mathcal{U}_{d-1}} \cong \mathcal{Q}^{d-1}_{(0)} \oplus \mathcal{Q}^{d-1}_{(1)}$. This is because $|d\rangle$ generates $\mathcal{Q}^{d-1}_{(0)}$, a trivial irrep of $\mathcal{U}_{d-1}$; and
$|1\rangle, \ldots, |d-1\rangle$ generate $\mathcal{Q}^{d-1}_{(1)}$, a defining irrep of $\mathcal{U}_{d-1}$. Another way to say this is that $|j\rangle$ is acted on
according to the trivial irrep of $\mathcal{U}_1, \ldots, \mathcal{U}_{j-1}$ and according to the defining irrep of $\mathcal{U}_j, \ldots, \mathcal{U}_d$. Thus
$|j\rangle$ corresponds to the chain of partitions $\{(0)^{j-1}, (1)^{d-j+1}\}$. We will return to this example several
times in the rest of the chapter.

The Young-Yamanouchi basis for $\mathcal{P}_\lambda$: The situation for $\mathcal{S}_n$ is quite similar. Our chain of subgroups
is $\{e\} = \mathcal{S}_1 \subset \mathcal{S}_2 \subset \ldots \subset \mathcal{S}_n$, where for $m < n$ we define $\mathcal{S}_m \subset \mathcal{S}_n$ to be the permutations in $\mathcal{S}_n$ which
leave the last $n - m$ elements fixed. For example, if $n = 3$, then $\mathcal{S}_3 = \{e, (12), (23), (13), (123), (321)\}$,
$\mathcal{S}_2 = \{e, (12)\}$, and $\mathcal{S}_1 = \{e\}$. Recall that the irreps of $\mathcal{S}_n$ can be labeled by $\mathcal{I}_n$, the partitions
of $n$ into $\le n$ parts.

Again, the branching from $\mathcal{S}_n$ to $\mathcal{S}_{n-1}$ is multiplicity-free, so to determine an orthonormal basis
$P_\lambda$ for the space $\mathcal{P}_\lambda$ we need only know which irreps occur in the decomposition of $\mathcal{P}_\lambda{\downarrow}_{\mathcal{S}_{n-1}}$. It turns
out that the branching rule is given by finding all ways to remove one box from $\lambda$ while leaving a valid
partition. Denote the set of such partitions by $\lambda - \Box$. Formally, $\lambda - \Box := \mathcal{I}_{n-1} \cap \{\lambda - e_j : j = 1, \ldots, n\}$,
where we recall that $e_j$ is the unit vector in $\mathbb{Z}^n$ with a one in the $j$th position and zeroes elsewhere.
Thus, the general branching rule is

$$\mathcal{P}_\lambda{\downarrow}_{\mathcal{S}_{n-1}} \cong \bigoplus_{\mu \in \lambda - \Box} \mathcal{P}_\mu. \qquad (7.8)$$

For example, if $\lambda = (3, 2, 1)$, we might have the chain of partitions:

$$\underbrace{(3,2,1)}_{n=6} \to \underbrace{(2,2,1)}_{n=5} \to \underbrace{(2,2)}_{n=4} \to \underbrace{(2,1)}_{n=3} \to \underbrace{(1,1)}_{n=2} \to \underbrace{(1)}_{n=1} \qquad (7.9)$$

Again, we can concisely label this chain by writing the number $j$ in the box that is removed when
restricting from $\mathcal{S}_j$ to $\mathcal{S}_{j-1}$. The above example would then be

    1 3 6
    2 4
    5                                                       (7.10)

Note that the valid methods of filling a Young diagram are slightly different than for the $\mathcal{U}_d$ case. Now
we use each integer in $1, \ldots, n$ exactly once such that the numbers are increasing from left to right
and from top to bottom. (The same filling scheme appeared in the description of Young's natural
representation in Section 5.3.1, but the resulting basis states are of course quite different.)
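The box-removal branching rule gives a direct way to enumerate the Young-Yamanouchi basis. Below is a minimal sketch in which partitions are tuples without trailing zeros and a basis vector is the chain $(p_n, \ldots, p_1)$; the function names are illustrative.

```python
def remove_box(lam):
    """Yield every partition obtained from lam by removing one box (lam - box)."""
    for i, row in enumerate(lam):
        below = lam[i + 1] if i + 1 < len(lam) else 0
        if row > below:  # the last box of row i can be removed
            mu = lam[:i] + (row - 1,) + lam[i + 1:]
            yield tuple(r for r in mu if r > 0)


def yy_chains(lam):
    """Yield chains (p_n = lam, ..., p_1 = (1,)) with p_{j-1} in p_j - box."""
    lam = tuple(lam)
    if sum(lam) == 1:
        yield (lam,)
        return
    for mu in remove_box(lam):
        for rest in yy_chains(mu):
            yield (lam,) + rest


def chain_to_tableau(chain):
    """Write j in the box that p_j has and p_{j-1} lacks; chain = (p_n, ..., p_1)."""
    n = len(chain)
    rows = [[0] * r for r in chain[0]]
    for idx, lam in enumerate(chain):
        j = n - idx
        smaller = chain[idx + 1] if idx + 1 < n else ()
        for i, r in enumerate(lam):
            s = smaller[i] if i < len(smaller) else 0
            if r > s:
                rows[i][r - 1] = j
    return rows
```

For $\lambda = (3,2,1)$ this enumeration yields 16 chains, one for each standard filling of the diagram, and `chain_to_tableau` maps the chain $(3,2,1) \to (2,2,1) \to (2,2) \to (2,1) \to (1,1) \to (1)$ to the rows `[1, 3, 6]`, `[2, 4]`, `[5]`.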

This gives rise to a straightforward, but inefficient, method of writing an element of $P_\lambda$ using $\log n!$ bits. However, for applications such as data compression [HM02a, HM02b] we will need an encoding which gives us closer to the optimal $\log |P_\lambda|$ bits. First recall that Eq. (6.13) gives an exact (and efficiently computable) expression for $|P_\lambda| = \dim \mathcal{P}_\lambda$. Now we would like to efficiently and reversibly map an element of $P_\lambda$ (thought of as a chain of partitions $p = (p_n = \lambda, \ldots, p_1 = (1)) \in P_\lambda$, with $p_j \in p_{j+1} - \square$) to an integer in $[|P_\lambda|] := \{1, \ldots, |P_\lambda|\}$. We will construct this bijection $f_n : P_\lambda \to [|P_\lambda|]$ by defining an ordering on $P_\lambda$ and setting $f_n(p) := |\{p' \in P_\lambda : p' \le p\}|$. First fix an arbitrary, but easily computable, (total) ordering on partitions in $\mathcal{I}_n$ for each $n$; for example, lexicographical order. This induces an ordering on $P_\lambda$ if we rank a basis vector $p \in P_\lambda$ first according to $p_{n-1}$, using the order on partitions we have chosen, then according to $p_{n-2}$ and so on. We skip $p_n$, since it is always equal to $\lambda$. In other words, for $p, p' \in P_\lambda$, $p > p'$ if $p_{n-1} > p'_{n-1}$, or $p_{n-1} = p'_{n-1}$ and $p_{n-2} > p'_{n-2}$, or $p_{n-1} = p'_{n-1}$, $p_{n-2} = p'_{n-2}$ and $p_{n-3} > p'_{n-3}$, and so on. Thus $f_n : P_\lambda \to [|P_\lambda|]$ can be easily verified to be

$$ f_n(p) = f_n(p_1, \ldots, p_n) = 1 + \sum_{k=2}^{n} \sum_{\substack{\mu \in p_k - \square \\ \mu < p_{k-1}}} \dim \mathcal{P}_\mu. \tag{7.11} $$

Thus $f_n$ is an injective map from $P_\lambda$ to $[|P_\lambda|]$. Moreover, since there are $O(n^2)$ terms in Eq. (7.11) and Eq. (6.13) gives an efficient way to calculate each $|P_\mu|$, this mapping can be performed in time polynomial in $n$.
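The ranking map of Eq. (7.11) can be illustrated with a small classical sketch. The code below is not taken from the thesis: the function names are hypothetical, partitions are represented as Python tuples ordered lexicographically, and the dimension of each irrep is computed with the hook length formula, which is assumed here to play the role of Eq. (6.13).

```python
from math import factorial

def remove_box(lam):
    """All partitions obtained by deleting one box from lam (a weakly decreasing tuple)."""
    out = []
    for j in range(len(lam)):
        below = lam[j + 1] if j + 1 < len(lam) else 0
        if lam[j] > below:  # removing a box from row j keeps the rows weakly decreasing
            mu = lam[:j] + (lam[j] - 1,) + lam[j + 1:]
            out.append(tuple(x for x in mu if x > 0))
    return out

def dim_P(lam):
    """Dimension of the S_n irrep labeled by lam, via the hook length formula."""
    hooks = 1
    for r, row in enumerate(lam):
        for c in range(row):
            arm = row - c - 1
            leg = sum(1 for r2 in range(r + 1, len(lam)) if lam[r2] > c)
            hooks *= arm + leg + 1
    return factorial(sum(lam)) // hooks

def rank(chain):
    """f_n(p) for a chain (p_n, p_{n-1}, ..., p_1), following the sum in Eq. (7.11)."""
    f = 1
    for parent, child in zip(chain, chain[1:]):
        # Count every chain that agrees so far but branches to a smaller partition.
        f += sum(dim_P(mu) for mu in remove_box(parent) if mu < child)
    return f

def chains(lam):
    """Enumerate all chains of partitions from lam down to (1)."""
    if sum(lam) == 1:
        yield (lam,)
        return
    for mu in remove_box(lam):
        for rest in chains(mu):
            yield (lam,) + rest
```

For $\lambda = (3,2,1)$ the sixteen chains receive the sixteen distinct labels $1, \ldots, 16$, which is the bijection property claimed in the text. Any other easily computable total order on partitions could replace the tuple comparison.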

Of course, the same techniques could be used to efficiently write an element of $\mathcal{Q}_\lambda^d$ in $\lceil \log \dim \mathcal{Q}_\lambda^d \rceil$ bits, but unless $d$ is large this usually is not necessary.

7.2 Constructing the Schur transform from a series of 
Clebsch-Gordan transforms 

In this section, we will show how the Schur transform on $(\mathbb{C}^d)^{\otimes n}$ can be reduced to a series of CG transforms on $\mathcal{U}_d$. The argument is divided into two parts. First, we give the theoretical underpinnings in Section 7.2.1 by using Schur duality to relate the $\mathcal{U}_d$ CG transform to branching in $\mathcal{S}_n$. Then we show how the actual algorithm works in Section 7.2.2.



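7.2.1 Branching rules and Clebsch-Gordan series for $\mathcal{U}_d$

Recall that the CG transform for $\mathcal{U}_d$ is given by

$$ \mathcal{Q}_\mu^d \otimes \mathcal{Q}_\nu^d \stackrel{\mathcal{U}_d}{\cong} \bigoplus_{\lambda \in \mathbb{Z}^d_{++}} \mathcal{Q}_\lambda^d \otimes \mathbb{C}^{M_{\mu\nu}^{\lambda}}. \tag{7.12} $$

For now, we will work with Littlewood-Richardson coefficients $M_{\mu\nu}^{\lambda}$ rather than the more structured space $\mathrm{Hom}(\mathcal{Q}_\lambda^d, \mathcal{Q}_\mu^d \otimes \mathcal{Q}_\nu^d)^{\mathcal{U}_d}$. The partitions $\lambda$ appearing on the RHS of Eq. (7.12) are sometimes known as the Clebsch-Gordan series. In this section, we will show (following [GW98]) how the $\mathcal{U}_d$ Clebsch-Gordan series is related to the behavior of $\mathcal{S}_n$ irreps under restriction.

For integers $k, n$ with $1 \le k < n$, embed $\mathcal{S}_k \times \mathcal{S}_{n-k}$ as a subgroup of $\mathcal{S}_n$ in the natural way; as permutations that leave the sets $\{1, \ldots, k\}$ and $\{k+1, \ldots, n\}$ invariant. The irreps of $\mathcal{S}_k \times \mathcal{S}_{n-k}$ are $\mathcal{P}_\mu \hat{\otimes} \mathcal{P}_\nu$, where $\mu \in \mathcal{I}_k$ and $\nu \in \mathcal{I}_{n-k}$.

Under restriction to $\mathcal{S}_k \times \mathcal{S}_{n-k} \subset \mathcal{S}_n$, the $\mathcal{S}_n$-irrep $\mathcal{P}_\lambda$ decomposes as

$$ \mathcal{P}_\lambda{\downarrow}_{\mathcal{S}_k \times \mathcal{S}_{n-k}} \cong \bigoplus_{\mu \in \mathcal{I}_k} \bigoplus_{\nu \in \mathcal{I}_{n-k}} \mathcal{P}_\mu \hat{\otimes} \mathcal{P}_\nu \otimes \mathbb{C}^{N_{\mu\nu}^{\lambda}} \tag{7.13} $$

for some multiplicities $N_{\mu\nu}^{\lambda}$ (possibly zero).

Claim 7.1. $M_{\mu\nu}^{\lambda} = N_{\mu\nu}^{\lambda}$.

As a corollary, $M_{\mu\nu}^{\lambda}$ is only nonzero when $|\lambda| = |\mu| + |\nu|$.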

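Proof. Consider the action of $\mathcal{S}_k \times \mathcal{S}_{n-k} \times \mathcal{U}_d$ on $(\mathbb{C}^d)^{\otimes n}$. On the one hand, Eq. (7.13) gives

$$ (\mathbb{C}^d)^{\otimes n} \stackrel{\mathcal{S}_n \times \mathcal{U}_d}{\cong} \bigoplus_{\lambda \in \mathcal{I}_{d,n}} \mathcal{P}_\lambda \hat{\otimes} \mathcal{Q}_\lambda^d \stackrel{\mathcal{S}_k \times \mathcal{S}_{n-k} \times \mathcal{U}_d}{\cong} \bigoplus_{\lambda \in \mathcal{I}_{d,n}} \bigoplus_{\substack{\mu \in \mathcal{I}_{d,k} \\ \nu \in \mathcal{I}_{d,n-k}}} \mathcal{P}_\mu \hat{\otimes} \mathcal{P}_\nu \hat{\otimes} \mathcal{Q}_\lambda^d \otimes \mathbb{C}^{N_{\mu\nu}^{\lambda}}. \tag{7.14} $$

On the other hand, we can apply Eq. (7.12) to obtain

$$ (\mathbb{C}^d)^{\otimes n} = (\mathbb{C}^d)^{\otimes k} \otimes (\mathbb{C}^d)^{\otimes n-k} \stackrel{\mathcal{S}_k \times \mathcal{S}_{n-k} \times \mathcal{U}_d}{\cong} \bigoplus_{\substack{\mu \in \mathcal{I}_{d,k} \\ \nu \in \mathcal{I}_{d,n-k}}} (\mathcal{P}_\mu \hat{\otimes} \mathcal{Q}_\mu^d) \otimes (\mathcal{P}_\nu \hat{\otimes} \mathcal{Q}_\nu^d) \cong \bigoplus_{\lambda \in \mathcal{I}_{d,n}} \bigoplus_{\substack{\mu \in \mathcal{I}_{d,k} \\ \nu \in \mathcal{I}_{d,n-k}}} \mathcal{P}_\mu \hat{\otimes} \mathcal{P}_\nu \hat{\otimes} \mathcal{Q}_\lambda^d \otimes \mathbb{C}^{M_{\mu\nu}^{\lambda}}. \tag{7.15} $$

Equating Eqns. (7.14) and (7.15) proves the desired equality. □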



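This means that the branching rules of $\mathcal{S}_n$ determine the CG series for $\mathcal{U}_d$.* In particular, suppose $k = n - 1$. Then $\mathcal{S}_1$ is the trivial group, so restricting to $\mathcal{S}_{n-1} \times \mathcal{S}_1$ is equivalent to simply restricting to $\mathcal{S}_{n-1}$. According to the branching rule stated in Eq. (7.8), this means that $M_{\lambda,(1)}^{\lambda'}$ is one if $\lambda \in \lambda' - \square$ and zero otherwise. In other words, for the case when one irrep is the defining irrep, the CG series is

$$ \mathcal{Q}_\lambda^d \otimes \mathcal{Q}_{(1)}^d \stackrel{\mathcal{U}_d}{\cong} \bigoplus_{\lambda' \in \lambda + \square} \mathcal{Q}_{\lambda'}^d. \tag{7.16} $$

Here $\lambda + \square$ denotes the set of valid Young diagrams obtained by adding one box to $\lambda$. For example, if $\lambda = (3, 2, 1)$ then

$$ \mathcal{Q}^3_{(3,2,1)} \otimes \mathcal{Q}^3_{(1)} \stackrel{\mathcal{U}_3}{\cong} \mathcal{Q}^3_{(4,2,1)} \oplus \mathcal{Q}^3_{(3,3,1)} \oplus \mathcal{Q}^3_{(3,2,2)} \tag{7.17} $$

or in Young diagram form

[Young diagrams: the shape $(3,2,1)$ tensored with a single box decomposes into the shapes $(4,2,1)$, $(3,3,1)$ and $(3,2,2)$.] (7.18)

Note that if we had $d > 3$, then the partition $(3, 2, 1, 1)$ would also appear.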
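The "add a box" rule of Eq. (7.16) is easy to check by direct enumeration. The following is a minimal sketch, assuming partitions are stored as tuples padded with zeros to length $d$; the function name `add_box` is hypothetical.

```python
def add_box(lam, d):
    """Valid Young diagrams with at most d rows obtained by adding one box to lam."""
    lam = tuple(lam) + (0,) * (d - len(lam))  # pad with empty rows up to d rows
    out = []
    for j in range(d):
        # A box may be added to row j only if the row above is strictly longer.
        if j == 0 or lam[j - 1] > lam[j]:
            out.append(lam[:j] + (lam[j] + 1,) + lam[j + 1:])
    return out

print(add_box((3, 2, 1), 3))  # the three shapes of Eq. (7.17)
print(add_box((3, 2, 1), 4))  # for d = 4 the shape (3, 2, 1, 1) also appears
```

The same function with rows removed rather than added gives the $\mathcal{S}_n$ branching rule, which is the duality the surrounding text describes.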

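We now seek to define the CG transform as a quantum circuit. We specialize to the case where one of the input irreps is the defining irrep, but allow the other irrep to be specified by a quantum input. The resulting CG transform is defined as:

$$ U_{\mathrm{CG}} = \sum_{\lambda \in \mathbb{Z}^d_{++}} |\lambda\rangle\langle\lambda| \otimes U_{\mathrm{CG}}^{\lambda,(1)}. \tag{7.19} $$

This takes as input a state of the form $|\lambda\rangle|q\rangle|i\rangle$, for $\lambda \in \mathbb{Z}^d_{++}$, $|q\rangle \in \mathcal{Q}_\lambda^d$ and $i \in [d]$. The output is a superposition over vectors $|\lambda\rangle|\lambda'\rangle|q'\rangle$, where $\lambda' = \lambda + e_j \in \mathbb{Z}^d_{++}$, $j \in [d]$ and $|q'\rangle \in \mathcal{Q}_{\lambda'}^d$. Equivalently, we could output $|\lambda\rangle|j\rangle|q'\rangle$ or $|j\rangle|\lambda'\rangle|q'\rangle$, since $(\lambda, \lambda')$, $(\lambda, j)$ and $(\lambda', j)$ are all trivially related via reversible classical circuits.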

To better understand the input space of $U_{\mathrm{CG}}$, we introduce the model representation $\mathcal{Q}^d_* := \bigoplus_{\lambda \in \mathbb{Z}^d_{++}} \mathcal{Q}_\lambda^d$, with corresponding matrix $\mathbf{q}^d_*(U) = \sum_\lambda |\lambda\rangle\langle\lambda| \otimes \mathbf{q}_\lambda^d(U)$. The model representation (also sometimes called the Schwinger representation) is infinite dimensional and contains each irrep once.† Its basis vectors are of the form $|\lambda, q\rangle$ for $\lambda \in \mathbb{Z}^d_{++}$ and $|q\rangle \in \mathcal{Q}_\lambda^d$. Since $\mathcal{Q}^d_*$ is infinite-dimensional, we cannot store it on a quantum computer and in this thesis work only with representations $\mathcal{Q}_\lambda^d$ with $|\lambda| \le n$; nevertheless $\mathcal{Q}^d_*$ is a useful abstraction.

Thus $U_{\mathrm{CG}}$ decomposes $\mathcal{Q}^d_* \otimes \mathcal{Q}^d_{(1)}$ into irreps. There are two important things to notice about this version of the CG transform. First is that it operates simultaneously on different input irreps. Second is that different input irreps must remain orthogonal, so in order to maintain unitarity $U_{\mathrm{CG}}$ needs to keep the information of which irrep we started with. However, since $\lambda' = \lambda + e_j$, this information requires only storing some $j \in [d]$. Thus, $U_{\mathrm{CG}}$ is a map from $\mathcal{Q}^d_* \otimes \mathbb{C}^d$ to $\mathcal{Q}^d_* \otimes \mathbb{C}^d$, where the $\mathbb{C}^d$ in the input is the defining representation and the $\mathbb{C}^d$ in the output tracks which irrep we started with.

*We can similarly obtain the CG series for $\mathcal{S}_n$ by studying the branching from $\mathcal{U}_{d_1 d_2}$ to $\mathcal{U}_{d_1} \times \mathcal{U}_{d_2}$. This is a useful tool for studying the relation between spectra of a bipartite density matrix $\rho^{AB}$ and of the reduced density matrices $\rho^A$ and $\rho^B$ [CM04, Kly04].

†By contrast, $L^2(\mathcal{U}_d)$, which we will not use, contains $\mathcal{Q}_\lambda^d$ with multiplicity $\dim \mathcal{Q}_\lambda^d$.




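[Figure 7-1: circuit diagram of $U_{\mathrm{CG}}$ with inputs $|\lambda\rangle$, $|q\rangle$, $|i\rangle$ and outputs $|\lambda\rangle$, $|\lambda'\rangle$, $|q'\rangle$.]

Figure 7-1: Schematic of the Clebsch-Gordan transform. Equivalently, we could replace either the $\lambda$ output or the $\lambda'$ output with $j$.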



7.2.2 Constructing the Schur Transform from Clebsch-Gordan Transforms 

We now describe how to construct the Schur transform out of a series of Clebsch-Gordan transforms. Suppose we start with an input vector $|i_1, \ldots, i_n\rangle \in (\mathbb{C}^d)^{\otimes n}$, corresponding to the $\mathcal{U}_d$-representation $(\mathcal{Q}^d_{(1)})^{\otimes n}$. According to Schur duality (Eq. (5.16)), to perform the Schur transform it suffices to decompose $(\mathcal{Q}^d_{(1)})^{\otimes n}$ into $\mathcal{U}_d$-irreps. This is because Schur duality means that the multiplicity space of $\mathcal{Q}_\lambda^d$ must be isomorphic to $\mathcal{P}_\lambda$. In other words, if we show that

$$ (\mathcal{Q}^d_{(1)})^{\otimes n} \stackrel{\mathcal{U}_d}{\cong} \bigoplus_{\lambda \in \mathbb{Z}^d_{++}} \mathcal{Q}_\lambda^d \otimes \mathcal{P}'_\lambda, \tag{7.20} $$

then we must have $\mathcal{P}'_\lambda \stackrel{\mathcal{S}_n}{\cong} \mathcal{P}_\lambda$ when $\lambda \in \mathcal{I}_{d,n}$ and $\mathcal{P}'_\lambda = \{0\}$ otherwise.

To perform the $\mathcal{U}_d$-irrep decomposition of Eq. (7.20), we simply combine each of $|i_1\rangle, \ldots, |i_n\rangle$ using the CG transform, one at a time. We start by inputting $|\lambda^{(1)}\rangle = |(1)\rangle$, $|i_1\rangle$ and $|i_2\rangle$ into $U_{\mathrm{CG}}$, which outputs $|\lambda^{(1)}\rangle$ and a superposition of different values of $|\lambda^{(2)}\rangle$ and $|q_2\rangle$. Here $\lambda^{(2)}$ can be either $(2,0)$ or $(1,1)$ and $|q_2\rangle \in \mathcal{Q}^d_{\lambda^{(2)}}$. Continuing, we apply $U_{\mathrm{CG}}$ to $|\lambda^{(2)}\rangle|q_2\rangle|i_3\rangle$, and output a superposition of vectors of the form $|\lambda^{(2)}\rangle|\lambda^{(3)}\rangle|q_3\rangle$, with $\lambda^{(3)} \in \mathcal{I}_{d,3}$ and $|q_3\rangle \in \mathcal{Q}^d_{\lambda^{(3)}}$. Each time we are combining an arbitrary irrep $\lambda^{(k)}$ and an associated basis vector $|q_k\rangle \in \mathcal{Q}^d_{\lambda^{(k)}}$, together with a vector $|i_{k+1}\rangle$ from the defining irrep. This is repeated for $k = 1, \ldots, n-1$ and the resulting circuit is depicted in Fig. 7-2.

Finally, we are left with a superposition of states of the form $|\lambda^{(1)}, \ldots, \lambda^{(n)}\rangle|q_n\rangle$, where $|q_n\rangle \in \mathcal{Q}^d_{\lambda^{(n)}}$, $\lambda^{(k)} \in \mathcal{I}_{d,k}$ and each $\lambda^{(k)}$ is obtained by adding a single box to $\lambda^{(k-1)}$; i.e. $\lambda^{(k)} = \lambda^{(k-1)} + e_{j_k}$ for some $j_k \in [d]$. If we define $\lambda = \lambda^{(n)}$ and $|q\rangle = |q_n\rangle$, then we have the decomposition of Eq. (7.20) with $\mathcal{P}'_\lambda$ spanned by the vectors $|\lambda^{(1)}, \ldots, \lambda^{(n-1)}\rangle$ satisfying the constraints described above. But this is precisely the Young-Yamanouchi basis $P_\lambda$ that we have defined in Section 7.1! Since the first $k$ qudits transform under $\mathcal{U}_d$ according to $\mathcal{Q}^d_{\lambda^{(k)}}$, Schur duality implies that they also transform under $\mathcal{S}_k$ according to $\mathcal{P}_{\lambda^{(k)}}$. Thus we set $|p\rangle = |\lambda^{(1)}, \ldots, \lambda^{(n-1)}\rangle$ (optionally compressing to $\lceil \log |P_\lambda| \rceil$ qubits using the techniques described in the last section) and obtain the desired $|\lambda\rangle|q\rangle|p\rangle$. As a check on this result, note that each $\lambda^{(k)}$ is invariant under $\mathbf{Q}(\mathcal{U}_d)$ since $U^{\otimes n}$ acts on the first $k$ qudits simply as $U^{\otimes k}$.

If we choose not to perform the poly($n$) steps to optimally compress $|\lambda^{(1)}, \ldots, \lambda^{(n-1)}\rangle$, we could instead have our circuit output the equivalent $|j_1, \ldots, j_{n-1}\rangle$, which requires only $n \log d$ qubits and asymptotically no extra running time.

[Figure 7-2: circuit diagram of a cascade of $U_{\mathrm{CG}}$ gates; the $k$th gate takes $|\lambda^{(k)}\rangle$, $|q_k\rangle$ and $|i_{k+1}\rangle$, and the last gate outputs $|\lambda^{(n)}\rangle$ and $|q_n\rangle$.]

Figure 7-2: Cascading Clebsch-Gordan transforms to produce the Schur transform. Not shown are any ancilla inputs to the Clebsch-Gordan transforms. The structure of inputs and outputs of the Clebsch-Gordan transforms are the same as in Fig. 7-1.



We can now appreciate the similarity between the $\mathcal{U}_d$ CG "add a box" prescription and the $\mathcal{S}_{n-1} \subset \mathcal{S}_n$ branching rule of "remove a box." Schur duality implies that the representations $\mathcal{Q}_{\lambda'}^d$ that are obtained by decomposing $\mathcal{Q}_\lambda^d \otimes \mathcal{Q}_{(1)}^d$ are the same as the $\mathcal{S}_n$-irreps $\mathcal{P}_{\lambda'}$ that include $\mathcal{P}_\lambda$ when restricted to $\mathcal{S}_{n-1}$.

Define $T_{\mathrm{CG}}(n, d, \epsilon)$ to be the time complexity (in terms of number of gates) of performing a single $\mathcal{U}_d$ CG transform to accuracy $\epsilon$ on Young diagrams with $\le n$ boxes. Then the total complexity for the Schur transform is $n \cdot (T_{\mathrm{CG}}(n, d, \epsilon/n) + O(1))$, possibly plus a poly($n$) factor for compressing the $\mathcal{P}_\lambda$ register to $\lceil \log \dim \mathcal{P}_\lambda \rceil$ qubits (as is required for applications such as data compression and entanglement concentration, cf. Section 6.3). In the next section we will show that $T_{\mathrm{CG}}(n, d, \epsilon)$ is poly($\log n$, $d$, $\log 1/\epsilon$), but first we give a step-by-step description of the algorithm for the Schur transform.



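Algorithm: Schur transform (plus optional compression)

Inputs: (1) Classical registers $d$ and $n$. (2) An $n$ qudit quantum register $|i_1, \ldots, i_n\rangle$.
Outputs: Quantum registers $|\lambda\rangle|q\rangle|p\rangle$, with $\lambda \in \mathcal{I}_{d,n}$, $q \in \mathcal{Q}_\lambda^d$ and $p \in P_\lambda$.
Runtime: $n \cdot (T_{\mathrm{CG}}(n, d, \epsilon/n) + O(1))$ to achieve accuracy $\epsilon$.
(Optionally plus poly($n$) to compress the $\mathcal{P}_\lambda$ register to $\lceil \log \dim \mathcal{P}_\lambda \rceil$ qubits.)
Procedure:

1. Initialize $|\lambda^{(1)}\rangle := |(1)\rangle$ and $|q_1\rangle = |i_1\rangle$.

2. For $k = 1, \ldots, n-1$:

3.     Apply $U_{\mathrm{CG}}$ to $|\lambda^{(k)}\rangle|q_k\rangle|i_{k+1}\rangle$ to obtain output $|j_k\rangle|\lambda^{(k+1)}\rangle|q_{k+1}\rangle$, where $\lambda^{(k+1)} = \lambda^{(k)} + e_{j_k}$.

4. Output $|\lambda\rangle := |\lambda^{(n)}\rangle$, $|q\rangle := |q_n\rangle$ and $|p\rangle := |j_1, \ldots, j_{n-1}\rangle$.

5. (Optionally use Eq. (7.11) to reversibly map $|j_1, \ldots, j_{n-1}\rangle$ to an integer $p \in [\dim \mathcal{P}_\lambda]$.)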



This algorithm will be made efficient in the next section, where we efficiently construct the CG transform for $\mathcal{U}_d$, proving that $T_{\mathrm{CG}}(n, d, \epsilon) = $ poly($\log n$, $d$, $\log 1/\epsilon$).
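The classical bookkeeping behind the cascade can be sanity-checked without any quantum simulation. The sketch below is an illustration rather than part of the algorithm: it follows only the irrep labels $\lambda^{(k)}$ through $n-1$ "add a box" steps, counts how many sequences $(j_1, \ldots, j_{n-1})$ lead to each final $\lambda$ (this count is $\dim \mathcal{P}_\lambda$), and checks that the dimensions add up to $d^n$. The dimension of $\mathcal{Q}_\lambda^d$ is computed with the Weyl dimension formula, which is assumed here; all function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction

def schur_path_counts(n, d):
    """Number of add-a-box sequences (j_1, ..., j_{n-1}) ending at each lambda in I_{d,n}."""
    counts = Counter({(1,) + (0,) * (d - 1): 1})  # lambda^(1) = (1)
    for _ in range(n - 1):
        nxt = Counter()
        for lam, c in counts.items():
            for j in range(d):
                if j == 0 or lam[j - 1] > lam[j]:  # lambda + e_j must stay a partition
                    nxt[lam[:j] + (lam[j] + 1,) + lam[j + 1:]] += c
        counts = nxt
    return counts

def dim_Q(lam):
    """Dimension of the U_d irrep with highest weight lam (Weyl dimension formula)."""
    d = len(lam)
    r = Fraction(1)
    for i in range(d):
        for j in range(i + 1, d):
            r *= Fraction(lam[i] - lam[j] + j - i, j - i)
    return int(r)

def total_dimension(n, d):
    """Sum over lambda of dim P_lambda * dim Q_lambda; Schur duality predicts d**n."""
    return sum(paths * dim_Q(lam) for lam, paths in schur_path_counts(n, d).items())
```

For instance, `schur_path_counts(4, 2)` assigns 1, 3 and 2 paths to $(4,0)$, $(3,1)$ and $(2,2)$, and with dimensions 5, 3 and 1 for the corresponding $\mathcal{U}_2$ irreps the total is $5 + 9 + 2 = 16 = 2^4$.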






7.3 Efficient circuits for the Clebsch-Gordan transform

We now turn to the actual construction of the circuit for the Clebsch-Gordan transform described in Section 7.2.1. To get a feel for what will be necessary, we start by giving a circuit for the CG transform that is efficient when $d$ is constant; i.e. it has complexity $n^{O(d^2)}$, which is poly($n$) for any constant value of $d$.

First recall that $\dim \mathcal{Q}_\lambda^d \le (n+1)^{d^2}$. Thus, controlled on $\lambda$, we want to construct a unitary transform on a $D$-dimensional system for $D = \max_{\lambda \in \mathcal{I}_{d,n}} \dim \mathcal{Q}_\lambda^d = $ poly($n$). There are classical algorithms [Lou70] to compute matrix elements of $U_{\mathrm{CG}}$ to an accuracy $\epsilon_1$ in time poly($D$) poly log($1/\epsilon_1$). Once we have calculated all the relevant matrix elements (of which there are only polynomially many), we can (again in time poly($D$) poly log($1/\epsilon$)) decompose $U_{\mathrm{CG}}$ into $D^2$ poly log($D$) elementary one- and two-qubit operations [SBM04, RZBB94, Bar95, NC00]. These can in turn be approximated to accuracy $\epsilon_2$ by products of unitary operators from a fixed finite set (such as Clifford operators and a $\pi/8$ rotation) with a further overhead of poly log($1/\epsilon_2$) [DN05, KSV02]. We can either assume the relevant classical computations (such as decomposing the $D \times D$ matrix into elementary gates) are performed coherently on a quantum computer, or as part of a polynomial-time classical Turing machine which outputs the quantum circuit. In any case, the total complexity is poly($n$, $\log 1/\epsilon$) if the desired final accuracy is $\epsilon$ and $d$ is held constant.

The goal of this section is to reduce this running time to poly($n$, $d$, $\log(1/\epsilon)$); in fact, we will achieve circuits of size poly($d$, $\log n$, $\log(1/\epsilon)$). To do so, we will reduce the $\mathcal{U}_d$ CG transform to two components; first, a $\mathcal{U}_{d-1}$ CG transform, and second, a $d \times d$ unitary matrix whose entries can be computed classically in poly($d$, $\log n$, $\log 1/\epsilon$) steps. After computing all $d^2$ entries, the second component can then be implemented with poly($d$, $\log 1/\epsilon$) gates according to the above arguments.

This reduction from the $\mathcal{U}_d$ CG transform to the $\mathcal{U}_{d-1}$ CG transform is a special case of the Wigner-Eckart Theorem, which we review in Section 7.3.1. Then, following [BL68, Lou70], we use the Wigner-Eckart Theorem to give an efficient recursive construction for $U_{\mathrm{CG}}$ in Section 7.3.2. Putting everything together, we obtain a quantum circuit for the Schur transform that is accurate to within $\epsilon$ and runs in time $n \cdot$ poly($\log n$, $d$, $\log 1/\epsilon$), optionally plus an additional poly($n$) time to compress the $|p\rangle$ register.

7.3.1 The Wigner-Eckart Theorem and Clebsch-Gordan transform 

In this section, we introduce the concept of an irreducible tensor operator, which we use to state and prove the Wigner-Eckart Theorem. Here we will find that the CG transform is a key part of the Wigner-Eckart Theorem, while in the next section we will turn this around and use the Wigner-Eckart Theorem to give a recursive decomposition of the CG transform.

Suppose $(\mathbf{r}_1, V_1)$ and $(\mathbf{r}_2, V_2)$ are representations of $\mathcal{U}_d$. Recall that $\mathrm{Hom}(V_1, V_2)$ is a representation of $\mathcal{U}_d$ under the map $T \to \mathbf{r}_2(U) T \mathbf{r}_1(U)^{-1}$ for $T \in \mathrm{Hom}(V_1, V_2)$. If $\mathbf{T} = \{T_1, T_2, \ldots\} \subset \mathrm{Hom}(V_1, V_2)$ is a basis for a $\mathcal{U}_d$-invariant subspace of $\mathrm{Hom}(V_1, V_2)$, then we call $\mathbf{T}$ a tensor operator. Note that a tensor operator $\mathbf{T}$ is a collection of operators $\{T_i\}$ indexed by $i$, just as a tensor (or vector) is a collection of scalars labeled by some index. For example, the Pauli matrices $\{\sigma_x, \sigma_y, \sigma_z\} \subset \mathrm{Hom}(\mathbb{C}^2, \mathbb{C}^2)$ comprise a tensor operator, since conjugation by $\mathcal{U}_2$ preserves the subspace that they span.
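The Pauli example can be checked numerically. The sketch below (pure Python, hypothetical helper names) conjugates each Pauli matrix by a fixed $2 \times 2$ unitary and expands the result back in the Pauli basis using $R_{lk} = \tfrac{1}{2} \mathrm{tr}(\sigma_l U \sigma_k U^\dagger)$. The unitary is parametrized as an element of $SU(2)$, which is an assumption made for brevity; a global phase would cancel under conjugation.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

PAULI = [
    [[0, 1], [1, 0]],       # sigma_x
    [[0, -1j], [1j, 0]],    # sigma_y
    [[1, 0], [0, -1]],      # sigma_z
]

def su2(alpha, beta, theta):
    """A 2x2 special unitary matrix built from three real parameters."""
    a = cmath.exp(1j * alpha) * math.cos(theta)
    b = cmath.exp(1j * beta) * math.sin(theta)
    return [[a, -b.conjugate()], [b, a.conjugate()]]

def conjugation_matrix(U):
    """R with U sigma_k U^dagger = sum_l R[l][k] sigma_l."""
    R = [[0.0] * 3 for _ in range(3)]
    for k in range(3):
        X = matmul(matmul(U, PAULI[k]), dagger(U))
        for l in range(3):
            prod = matmul(PAULI[l], X)
            R[l][k] = 0.5 * (prod[0][0] + prod[1][1])  # half the trace
    return R
```

The entries of `R` come out real (up to rounding) and `R` is a $3 \times 3$ rotation, so the span of the Pauli matrices is mapped to itself, which is the invariance property that makes them a tensor operator.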

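Since $\mathrm{Hom}(V_1, V_2)$ is a representation of $\mathcal{U}_d$, it can be decomposed into irreps. If $\mathbf{T}$ is a basis for one of these irreps, then we call it an irreducible tensor operator. For example, the Pauli matrices mentioned above comprise an irreducible tensor operator, corresponding to the three-dimensional irrep $\mathcal{Q}^2_{(2)}$. Formally, we say that $\mathbf{T}^\nu = \{T^\nu_{q_\nu}\}_{q_\nu \in \mathcal{Q}^d_\nu} \subset \mathrm{Hom}(V_1, V_2)$ is an irreducible tensor operator (corresponding to the irrep $\mathcal{Q}^d_\nu$) if for all $U \in \mathcal{U}_d$ we have

$$ \mathbf{r}_2(U) T^\nu_{q_\nu} \mathbf{r}_1(U)^{-1} = \sum_{q'_\nu \in \mathcal{Q}^d_\nu} \langle q'_\nu | \mathbf{q}^d_\nu(U) | q_\nu \rangle T^\nu_{q'_\nu}. \tag{7.21} $$

Now assume that $V_1$ and $V_2$ are irreducible (say $V_1 = \mathcal{Q}^d_\mu$ and $V_2 = \mathcal{Q}^d_\lambda$), since if they are not, we could always decompose $\mathrm{Hom}(V_1, V_2)$ into a direct sum of homomorphisms from an irrep in $V_1$ to an irrep in $V_2$. We can decompose $\mathrm{Hom}(\mathcal{Q}^d_\mu, \mathcal{Q}^d_\lambda)$ into irreps using Eq. (5.3) and the identity $\mathrm{Hom}(A, B) \cong A^* \otimes B$ as follows:

$$ \begin{aligned} \mathrm{Hom}(\mathcal{Q}^d_\mu, \mathcal{Q}^d_\lambda) &\stackrel{\mathcal{U}_d}{\cong} \bigoplus_{\nu \in \mathbb{Z}^d_{++}} \mathcal{Q}^d_\nu \otimes \mathrm{Hom}(\mathcal{Q}^d_\nu, \mathrm{Hom}(\mathcal{Q}^d_\mu, \mathcal{Q}^d_\lambda))^{\mathcal{U}_d} \\ &\cong \bigoplus_{\nu \in \mathbb{Z}^d_{++}} \mathcal{Q}^d_\nu \otimes \mathrm{Hom}(\mathcal{Q}^d_\nu, (\mathcal{Q}^d_\mu)^* \otimes \mathcal{Q}^d_\lambda)^{\mathcal{U}_d} \\ &\cong \bigoplus_{\nu \in \mathbb{Z}^d_{++}} \mathcal{Q}^d_\nu \otimes \left( (\mathcal{Q}^d_\nu)^* \otimes (\mathcal{Q}^d_\mu)^* \otimes \mathcal{Q}^d_\lambda \right)^{\mathcal{U}_d} \\ &\cong \bigoplus_{\nu \in \mathbb{Z}^d_{++}} \mathcal{Q}^d_\nu \otimes \mathrm{Hom}(\mathcal{Q}^d_\mu \otimes \mathcal{Q}^d_\nu, \mathcal{Q}^d_\lambda)^{\mathcal{U}_d}. \end{aligned} \tag{7.22} $$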



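Now consider a particular irreducible tensor operator $\mathbf{T}^\nu \subset \mathrm{Hom}(\mathcal{Q}^d_\mu, \mathcal{Q}^d_\lambda)$ with components $T^\nu_{q_\nu}$, where $q_\nu$ ranges over $\mathcal{Q}^d_\nu$. We can define a linear operator $\hat{T} : \mathcal{Q}^d_\mu \otimes \mathcal{Q}^d_\nu \to \mathcal{Q}^d_\lambda$ by letting

$$ \hat{T} |q_\mu\rangle |q_\nu\rangle := T^\nu_{q_\nu} |q_\mu\rangle \tag{7.23} $$

for all $q_\mu \in \mathcal{Q}^d_\mu$, $q_\nu \in \mathcal{Q}^d_\nu$ and extending it to the rest of $\mathcal{Q}^d_\mu \otimes \mathcal{Q}^d_\nu$ by linearity. By construction, $\hat{T} \in \mathrm{Hom}(\mathcal{Q}^d_\mu \otimes \mathcal{Q}^d_\nu, \mathcal{Q}^d_\lambda)$, but we claim that in addition $\hat{T}$ is invariant under the action of $\mathcal{U}_d$; i.e. that it lies in $\mathrm{Hom}(\mathcal{Q}^d_\mu \otimes \mathcal{Q}^d_\nu, \mathcal{Q}^d_\lambda)^{\mathcal{U}_d}$. To see this, apply Eqns. (7.21) and (7.23) to show that for any $U \in \mathcal{U}_d$, $q_\mu \in \mathcal{Q}^d_\mu$ and $q_\nu \in \mathcal{Q}^d_\nu$, we have

$$ \begin{aligned} \mathbf{q}^d_\lambda(U) \hat{T} \left[ \mathbf{q}^d_\mu(U)^{-1} \otimes \mathbf{q}^d_\nu(U)^{-1} \right] |q_\mu\rangle |q_\nu\rangle &= \sum_{q'_\nu \in \mathcal{Q}^d_\nu} \langle q'_\nu | \mathbf{q}^d_\nu(U)^{-1} | q_\nu \rangle \, \mathbf{q}^d_\lambda(U) T^\nu_{q'_\nu} \mathbf{q}^d_\mu(U)^{-1} |q_\mu\rangle \\ &= \sum_{q'_\nu, q''_\nu \in \mathcal{Q}^d_\nu} \langle q'_\nu | \mathbf{q}^d_\nu(U)^{-1} | q_\nu \rangle \langle q''_\nu | \mathbf{q}^d_\nu(U) | q'_\nu \rangle \, T^\nu_{q''_\nu} |q_\mu\rangle \\ &= T^\nu_{q_\nu} |q_\mu\rangle = \hat{T} |q_\mu\rangle |q_\nu\rangle. \end{aligned} \tag{7.24} $$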



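Now, fix an orthonormal basis for $\mathrm{Hom}(\mathcal{Q}^d_\mu \otimes \mathcal{Q}^d_\nu, \mathcal{Q}^d_\lambda)^{\mathcal{U}_d}$ and call it $M^\lambda_{\mu,\nu}$. Then we can expand $\hat{T}$ in this basis as

$$ \hat{T} = \sum_{\alpha \in M^\lambda_{\mu,\nu}} \hat{T}_\alpha \cdot \alpha, \tag{7.25} $$

where the $\hat{T}_\alpha$ are scalars. Thus

$$ \langle q_\lambda | T^\nu_{q_\nu} | q_\mu \rangle = \sum_{\alpha \in M^\lambda_{\mu,\nu}} \hat{T}_\alpha \langle q_\lambda | \alpha | q_\mu, q_\nu \rangle. \tag{7.26} $$

This last expression $\langle q_\lambda | \alpha | q_\mu, q_\nu \rangle$ bears a striking resemblance to the CG transform. Indeed, note that the multiplicity space $\mathrm{Hom}(\mathcal{Q}^d_\lambda, \mathcal{Q}^d_\mu \otimes \mathcal{Q}^d_\nu)^{\mathcal{U}_d}$ from Eq. (5.4) is the dual of $\mathrm{Hom}(\mathcal{Q}^d_\mu \otimes \mathcal{Q}^d_\nu, \mathcal{Q}^d_\lambda)^{\mathcal{U}_d}$ (which contains $\alpha$), meaning that we can map between the two by taking the transpose. In fact, taking the conjugate transpose of Eq. (5.5) gives $\langle q_\lambda | \alpha = \langle q_\lambda, \alpha^\dagger | U^{\mu,\nu}_{\mathrm{CG}}$. Thus

$$ \langle q_\lambda | \alpha | q_\mu, q_\nu \rangle = \langle q_\lambda, \alpha^\dagger | U^{\mu,\nu}_{\mathrm{CG}} | q_\mu, q_\nu \rangle. \tag{7.27} $$

The arguments in the last few paragraphs constitute a proof of the Wigner-Eckart theorem [Mes62], which is stated as follows:

Theorem 7.2 (Wigner-Eckart). For any irreducible tensor operator $\mathbf{T}^\nu = \{T^\nu_{q_\nu}\}_{q_\nu \in \mathcal{Q}^d_\nu} \subset \mathrm{Hom}(\mathcal{Q}^d_\mu, \mathcal{Q}^d_\lambda)$, there exist $\hat{T}_\alpha \in \mathbb{C}$ for each $\alpha \in M^\lambda_{\mu,\nu}$ such that for all $|q_\mu\rangle \in \mathcal{Q}^d_\mu$, $|q_\nu\rangle \in \mathcal{Q}^d_\nu$ and $|q_\lambda\rangle \in \mathcal{Q}^d_\lambda$:

$$ \langle q_\lambda | T^\nu_{q_\nu} | q_\mu \rangle = \sum_{\alpha \in M^\lambda_{\mu,\nu}} \hat{T}_\alpha \langle q_\lambda, \alpha^\dagger | U^{\mu,\nu}_{\mathrm{CG}} | q_\mu, q_\nu \rangle. \tag{7.28} $$

Thus, the action of tensor operators can be related to a component $\hat{T}_\alpha$ that is invariant under $\mathcal{U}_d$ and a component that is equivalent to the CG transform. We will use this in the next section to derive an efficient quantum circuit for the CG transform.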



7.3.2 A recursive construction of the Clebsch-Gordan transform 

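In this section we show how the $\mathcal{U}_d$ CG transform (which here we call $U^{[d]}_{\mathrm{CG}}$) can be efficiently reduced to the $\mathcal{U}_{d-1}$ CG transform (which we call $U^{[d-1]}_{\mathrm{CG}}$). Our strategy, following [BL68], will be to express $U^{[d]}_{\mathrm{CG}}$ in terms of $\mathcal{U}_{d-1}$ tensor operators and then use the Wigner-Eckart Theorem to express it in terms of $U^{[d-1]}_{\mathrm{CG}}$. After we have explained this as a relation among operators, we describe a quantum circuit for $U^{[d]}_{\mathrm{CG}}$ that uses $U^{[d-1]}_{\mathrm{CG}}$ as a subroutine.

First, we express $U^{[d]}_{\mathrm{CG}}$ as a $\mathcal{U}_d$ tensor operator. For $\mu \in \mathbb{Z}^d_{++}$, $|q\rangle \in \mathcal{Q}^d_\mu$ and $i \in [d]$, we can expand $U^{[d]}_{\mathrm{CG}} |\mu\rangle|q\rangle|i\rangle$ as

$$ U^{[d]}_{\mathrm{CG}} |\mu\rangle|q\rangle|i\rangle = |\mu\rangle \sum_{\substack{j \in [d] \text{ s.t.} \\ \mu + e_j \in \mathbb{Z}^d_{++}}} \; \sum_{q' \in \mathcal{Q}^d_{\mu + e_j}} C^{\mu,j}_{q,i,q'} \, |\mu + e_j\rangle |q'\rangle \tag{7.29} $$

for some coefficients $C^{\mu,j}_{q,i,q'} \in \mathbb{C}$. Now define operators $T^{\mu,j}_i : \mathcal{Q}^d_\mu \to \mathcal{Q}^d_{\mu + e_j}$ by

$$ T^{\mu,j}_i = \sum_{q \in \mathcal{Q}^d_\mu} \sum_{q' \in \mathcal{Q}^d_{\mu + e_j}} C^{\mu,j}_{q,i,q'} \, |q'\rangle\langle q|, \tag{7.30} $$

so that $U^{[d]}_{\mathrm{CG}}$ decomposes as

$$ U^{[d]}_{\mathrm{CG}} |\mu\rangle|q\rangle|i\rangle = |\mu\rangle \sum_{\substack{j \in [d] \text{ s.t.} \\ \mu + e_j \in \mathbb{Z}^d_{++}}} |\mu + e_j\rangle \, T^{\mu,j}_i |q\rangle. \tag{7.31} $$

Thus $U^{[d]}_{\mathrm{CG}}$ can be understood in terms of the maps $T^{\mu,j}_i$, which are irreducible tensor operators in $\mathrm{Hom}(\mathcal{Q}^d_\mu, \mathcal{Q}^d_{\mu + e_j})$ corresponding to the irrep $\mathcal{Q}^d_{(1)}$. (This is unlike the notation of the last section in which the superscript denoted the irrep corresponding to the tensor operator.)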

The plan for the rest of the section is to decompose the $T^{\mu,j}_i$ operators under the action of $\mathcal{U}_{d-1}$, so that we can apply the Wigner-Eckart theorem. This involves decomposing three different $\mathcal{U}_d$ irreps into $\mathcal{U}_{d-1}$ irreps: the input space $\mathcal{Q}^d_\mu$, the output space $\mathcal{Q}^d_{\mu + e_j}$ and the space $\mathcal{Q}^d_{(1)}$ corresponding to the subscript $i$. Once we have done so, the Wigner-Eckart Theorem gives an expression for $T^{\mu,j}_i$ (and hence for $U^{[d]}_{\mathrm{CG}}$) in terms of $U^{[d-1]}_{\mathrm{CG}}$ and a small number of coefficients, known as reduced Wigner coefficients. These coefficients can be readily calculated, and in the next section we cite a formula from [BL68] for doing so.

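First, we examine the decomposition of $\mathcal{Q}^d_{(1)}$, the $\mathcal{U}_d$-irrep according to which the $T^{\mu,j}_i$ transform. Recall that $\mathcal{Q}^d_{(1)} \stackrel{\mathcal{U}_{d-1}}{\cong} \mathcal{Q}^{d-1}_{(0)} \oplus \mathcal{Q}^{d-1}_{(1)}$. In terms of the tensor operator we have defined, this means that $T^{\mu,j}_d$ is an irreducible $\mathcal{U}_{d-1}$ tensor operator corresponding to the trivial irrep $\mathcal{Q}^{d-1}_{(0)}$ and $\{T^{\mu,j}_1, \ldots, T^{\mu,j}_{d-1}\}$ comprise an irreducible tensor operator corresponding to the defining irrep $\mathcal{Q}^{d-1}_{(1)}$.

Next, we would like to decompose $\mathrm{Hom}(\mathcal{Q}^d_\mu, \mathcal{Q}^d_{\mu + e_j})$ into maps between irreps of $\mathcal{U}_{d-1}$. This is slightly more complicated, but can be derived from the $\mathcal{U}_{d-1} \subset \mathcal{U}_d$ branching rule introduced in Section 7.1.2. Recall that $\mathcal{Q}^d_\mu \stackrel{\mathcal{U}_{d-1}}{\cong} \bigoplus_{\mu' \precsim \mu} \mathcal{Q}^{d-1}_{\mu'}$, and similarly $\mathcal{Q}^d_{\mu + e_j} \stackrel{\mathcal{U}_{d-1}}{\cong} \bigoplus_{\mu'' \precsim \mu + e_j} \mathcal{Q}^{d-1}_{\mu''}$. This is the moment that we anticipated in Section 7.1.2 when we chose our set of basis vectors to respect these decompositions. As a result, a vector $|q\rangle \in \mathcal{Q}^d_\mu$ can be expanded as $q = (q_{d-1}, q_{d-2}, \ldots, q_1) = (\mu', q_{(d-2)})$ with $q_{d-1} = \mu' \in \mathbb{Z}^{d-1}_{++}$, $\mu' \precsim \mu$ and $|q_{(d-2)}\rangle = |q_{d-2}, \ldots, q_1\rangle \in \mathcal{Q}^{d-1}_{\mu'}$. In other words, we will separate vectors in $\mathcal{Q}^d_\mu$ into a $\mathcal{U}_{d-1}$ irrep label $\mu' \in \mathbb{Z}^{d-1}_{++}$ and a basis vector from $\mathcal{Q}^{d-1}_{\mu'}$.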

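This describes how to decompose the spaces $\mathcal{Q}^d_\mu$ and $\mathcal{Q}^d_{\mu + e_j}$. To extend this to a decomposition of $\mathrm{Hom}(\mathcal{Q}^d_\mu, \mathcal{Q}^d_{\mu + e_j})$, we use the canonical isomorphism $\mathrm{Hom}(\bigoplus_x A_x, \bigoplus_y B_y) \cong \bigoplus_{x,y} \mathrm{Hom}(A_x, B_y)$, which holds for any sets of vector spaces $\{A_x\}$ and $\{B_y\}$. Thus

$$ \mathrm{Hom}(\mathcal{Q}^d_\mu, \mathcal{Q}^d_{\mu + e_j}) \stackrel{\mathcal{U}_{d-1}}{\cong} \bigoplus_{\mu' \precsim \mu} \; \bigoplus_{\mu'' \precsim \mu + e_j} \mathrm{Hom}(\mathcal{Q}^{d-1}_{\mu'}, \mathcal{Q}^{d-1}_{\mu''}). \tag{7.32a} $$

Sometimes we will find it convenient to denote the $\mathcal{Q}^{d-1}_{\mu'}$ subspace of $\mathcal{Q}^d_\mu$ by $\mathcal{Q}^{d-1}_{\mu'} \subset \mathcal{Q}^d_\mu$, so that Eq. (7.32a) becomes

$$ \mathrm{Hom}(\mathcal{Q}^d_\mu, \mathcal{Q}^d_{\mu + e_j}) \stackrel{\mathcal{U}_{d-1}}{\cong} \bigoplus_{\mu' \precsim \mu} \; \bigoplus_{\mu'' \precsim \mu + e_j} \mathrm{Hom}(\mathcal{Q}^{d-1}_{\mu'} \subset \mathcal{Q}^d_\mu, \; \mathcal{Q}^{d-1}_{\mu''} \subset \mathcal{Q}^d_{\mu + e_j}). \tag{7.32b} $$

According to Eq. (7.32) (either version), we can decompose $T^{\mu,j}_i$ as

$$ T^{\mu,j}_i = \sum_{\mu' \precsim \mu} \; \sum_{\mu'' \precsim \mu + e_j} |\mu''\rangle\langle\mu'| \otimes T^{\mu,j,\mu',\mu''}_i. \tag{7.33} $$

Here $T^{\mu,j,\mu',\mu''}_i \in \mathrm{Hom}(\mathcal{Q}^{d-1}_{\mu'} \subset \mathcal{Q}^d_\mu, \mathcal{Q}^{d-1}_{\mu''} \subset \mathcal{Q}^d_{\mu + e_j})$ and we have implicitly decomposed $|q\rangle \in \mathcal{Q}^d_\mu$ into $|\mu'\rangle|q_{(d-2)}\rangle$.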

The next step is to decompose the representations in Eq. (7.32) into irreducible components. In fact, we are not interested in the entire space $\mathrm{Hom}(\mathcal{Q}^{d-1}_{\mu'}, \mathcal{Q}^{d-1}_{\mu''})$, but only the part that is equivalent to $\mathcal{Q}^{d-1}_{(1)}$ or $\mathcal{Q}^{d-1}_{(0)}$, depending on whether $i \in [d-1]$ or $i = d$ (since $T^{\mu,j,\mu',\mu''}_i$ transforms according to $\mathcal{Q}^{d-1}_{(1)}$ if $i \in \{1, \ldots, d-1\}$ and according to $\mathcal{Q}^{d-1}_{(0)}$ if $i = d$). This knowledge of how $T^{\mu,j,\mu',\mu''}_i$ transforms under $\mathcal{U}_{d-1}$ will give us two crucial simplifications: first, we can greatly reduce the range of $\mu''$ for which $T^{\mu,j,\mu',\mu''}_i$ is nonzero, and second, we can apply the Wigner-Eckart theorem to describe $T^{\mu,j,\mu',\mu''}_i$ in terms of $U^{[d-1]}_{\mathrm{CG}}$.

The simplest case is $\mathcal{Q}^{d-1}_{(0)}$, when $i = d$: according to Schur's Lemma the invariant component of $\mathrm{Hom}(\mathcal{Q}^{d-1}_{\mu'}, \mathcal{Q}^{d-1}_{\mu''})$ is zero if $\mu' \ne \mu''$ and consists of the matrices proportional to $I_{\mathcal{Q}^{d-1}_{\mu'}}$ if $\mu' = \mu''$. In other words $T^{\mu,j,\mu',\mu''}_d = 0$ unless $\mu' = \mu''$, in which case $T^{\mu,j,\mu',\mu'}_d := \hat{T}^{\mu,j,\mu',0} I_{\mathcal{Q}^{d-1}_{\mu'}}$ for some scalar $\hat{T}^{\mu,j,\mu',0}$. (The final superscript 0 will later be convenient when we want a single notation to encompass both the $i = d$ and the $i \in \{1, \ldots, d-1\}$ cases.)

The $\mathcal{Q}^{d-1}_{(1)}$ case, which occurs when $i \in \{1, \ldots, d-1\}$, is more interesting. We will simplify the $T^{\mu,j,\mu',\mu''}_i$ operators (for $i = 1, \ldots, d-1$) in two stages: first using the branching rules from Section 7.1.2 to reduce the number of nonzero terms and then by applying the Wigner-Eckart theorem to find an exact expression for them. Begin by recalling from Eq. (7.22) that the multiplicity of $\mathcal{Q}^{d-1}_{(1)}$ in the isotypic decomposition of $\mathrm{Hom}(\mathcal{Q}^{d-1}_{\mu'}, \mathcal{Q}^{d-1}_{\mu''})$ is given by $\dim \mathrm{Hom}(\mathcal{Q}^{d-1}_{\mu'} \otimes \mathcal{Q}^{d-1}_{(1)}, \mathcal{Q}^{d-1}_{\mu''})^{\mathcal{U}_{d-1}}$. According to the $\mathcal{U}_{d-1}$ CG "add a box" prescription (Eq. (7.16)), this is one if $\mu' \in \mu'' - \square$ and zero otherwise. Thus if $i \in [d-1]$, then $T^{\mu,j,\mu',\mu''}_i$ is zero unless $\mu'' = \mu' + e_{j'}$ for some $j' \in [d-1]$. Since we need not consider all possible $\mu''$, we can define $T^{\mu,j,\mu',j'}_i := T^{\mu,j,\mu',\mu' + e_{j'}}_i$. This notation can be readily extended to cover the case when $i = d$; define $e_0 = 0$, so that the only nonzero operators for $i = d$ are of the form $T^{\mu,j,\mu',0}_d := T^{\mu,j,\mu',\mu'}_d = \hat{T}^{\mu,j,\mu',0} I_{\mathcal{Q}^{d-1}_{\mu'}}$. Thus, we can replace Eq. (7.33) with

$$ T^{\mu,j}_i = \sum_{\mu' \precsim \mu} \sum_{j'=0}^{d-1} |\mu' + e_{j'}\rangle\langle\mu'| \otimes T^{\mu,j,\mu',j'}_i. \tag{7.34} $$

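Now we show how to apply the Wigner-Eckart theorem to the $i \in [d-1]$ case. The operators $T^{\mu,j,\mu',j'}_i$ map $\mathcal{Q}^{d-1}_{\mu'}$ to $\mathcal{Q}^{d-1}_{\mu' + e_{j'}}$ and comprise an irreducible tensor operator corresponding to the irrep $\mathcal{Q}^{d-1}_{(1)}$. This means we can apply the Wigner-Eckart Theorem and since the multiplicity of $\mathcal{Q}^{d-1}_{\mu' + e_{j'}}$ in $\mathcal{Q}^{d-1}_{\mu'} \otimes \mathcal{Q}^{d-1}_{(1)}$ is one, the sum over the multiplicity label $\alpha$ has only a single term. The theorem implies the existence of a set of scalars $\hat{T}^{\mu,j,\mu',j'}$ such that for any $|q\rangle \in \mathcal{Q}^{d-1}_{\mu'}$ and $|q'\rangle \in \mathcal{Q}^{d-1}_{\mu' + e_{j'}}$,

$$ \langle q' | T^{\mu,j,\mu',j'}_i | q \rangle = \hat{T}^{\mu,j,\mu',j'} \langle \mu', \mu' + e_{j'}, q' | U^{[d-1]}_{\mathrm{CG}} | \mu', q, i \rangle. \tag{7.35} $$

Sometimes the matrix elements of $U_{\mathrm{CG}}$ or $T^{\mu,j,\mu',j'}_i$ are called Wigner coefficients and the $\hat{T}^{\mu,j,\mu',j'}$ are known as reduced Wigner coefficients.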

Let us now try to interpret these equations operationally. Eq. (7.31) reduces the $\mathcal{U}_d$ CG transform to a $\mathcal{U}_d$ tensor operator, Eq. (7.34) decomposes this tensor operator into $d^2$ different $\mathcal{U}_{d-1}$ tensor operators (weighted by the $\hat{T}^{\mu,j,\mu',j'}$ coefficients) and Eq. (7.35) turns this into a $\mathcal{U}_{d-1}$ CG transform followed by a $d \times d$ unitary matrix. The coefficients for this matrix are the $\hat{T}^{\mu,j,\mu',j'}$, which we will see in the next section can be efficiently computed by conditioning on $\mu$ and $\mu'$.

Now we spell this recursion out in more detail. Suppose we wish to apply $U^{[d]}_{\mathrm{CG}}$ to $|\mu\rangle|q\rangle|i\rangle = |\mu\rangle|\mu'\rangle|q_{(d-2)}\rangle|i\rangle$, for some $i \in \{1, \ldots, d-1\}$. Then Eq. (7.35) indicates that we should first apply $U^{[d-1]}_{\mathrm{CG}}$ to $|\mu'\rangle|q_{(d-2)}\rangle|i\rangle$ to obtain output that is a superposition over states $|\mu' + e_{j'}\rangle|j'\rangle|q'_{(d-2)}\rangle$ for $j' \in \{1, \ldots, d-1\}$ and $|q'_{(d-2)}\rangle \in \mathcal{Q}^{d-1}_{\mu' + e_{j'}}$. Then, controlled by $\mu$ and $\mu'$, we want to map the $(d-1)$-dimensional $|j'\rangle$ register into the $d$-dimensional $|j\rangle$ register, which will then tell us the output irrep $\mathcal{Q}^d_{\mu + e_j}$. According to Eq. (7.35), the coefficients of this $d \times (d-1)$ matrix are given by the reduced Wigner coefficients $\hat{T}^{\mu,j,\mu',j'}$, so we will denote the overall matrix $\hat{T}^{[d]}_{\mu,\mu'} := \sum_{j,j'} \hat{T}^{\mu,j,\mu',j'} |j\rangle\langle j'|$. The resulting circuit is depicted in Fig. 7-3: a $\mathcal{U}_{d-1}$ CG transform is followed by the $\hat{T}^{[d]}$ operator,* which is defined to be

$$ \hat{T}^{[d]} = \sum_{\mu, \mu'} \sum_{j, j'} \hat{T}^{\mu,j,\mu',j'} \, |\mu\rangle\langle\mu| \otimes |\mu + e_j\rangle\langle\mu'| \otimes |\mu' + e_{j'}\rangle\langle\mu' + e_{j'}|. \tag{7.36} $$

Then Fig. 7-4 shows how $\hat{T}^{[d]}$ can be expressed as a $d \times (d-1)$ matrix that is controlled by $\mu$ and $\mu'$. In fact, once we consider the $i = d$ case in the next paragraph, we will find that $\hat{T}^{[d]}_{\mu,\mu'}$ is actually a $d \times d$ unitary matrix. In the next section, we will then show how the individual reduced Wigner coefficients $\hat{T}^{\mu,j,\mu',j'}$ can be efficiently computed, so that ultimately $\hat{T}^{[d]}_{\mu,\mu'}$ can be implemented in time poly($d$, $\log 1/\epsilon$).

*The reason why $\mu' + e_{j'}$ appears in the superscript rather than $\mu'$ is that after applying $U^{[d-1]}_{\mathrm{CG}}$, we want to keep a record of $\mu' + e_{j'}$ rather than of $\mu'$. This is further illustrated in Fig. 7-4.

Now we turn to the case of $i = d$. The circuit is much simpler, but we also need to explain how it works in coherent superposition with the $i \in [d-1]$ case. Since $i = d$ corresponds to the trivial representation of $\mathcal{U}_{d-1}$, the $U^{[d-1]}_{\mathrm{CG}}$ operation is not performed. Instead, $|\mu'\rangle$ and $|q_{(d-2)}\rangle$ are left untouched and the $|i\rangle = |d\rangle$ register is relabeled as a $|j'\rangle = |0\rangle$ register. We can combine this relabeling operation with $U^{[d-1]}_{\mathrm{CG}}$ in the $i \in [d-1]$ case by defining

$$ \tilde{U}^{[d-1]}_{\mathrm{CG}} := |0\rangle\langle d| \otimes \left[ \sum_{\mu' \in \mathbb{Z}^{d-1}_{++}} |\mu'\rangle\langle\mu'| \otimes I_{\mathcal{Q}^{d-1}_{\mu'}} \right] + U^{[d-1]}_{\mathrm{CG}}. \tag{7.37} $$

This ends up mapping $i \in \{1, \ldots, d\}$ to $j' \in \{0, \ldots, d-1\}$ while mapping $\mathcal{Q}^{d-1}_{\mu'}$ to $\mathcal{Q}^{d-1}_{\mu' + e_{j'}}$. Now we can interpret the sum on $j'$ in the above definitions of $\hat{T}^{[d]}$ and $\hat{T}^{[d]}_{\mu,\mu'}$ as ranging over $\{0, \ldots, d-1\}$, so that $\hat{T}^{[d]}_{\mu,\mu'}$ is a $d \times d$ unitary matrix. We thus obtain the circuit in Fig. 7-3 with the implementation of $\hat{T}^{[d]}$ depicted in Fig. 7-4.



[Figure 7-3: circuit diagram]

Figure 7-3: The $\mathcal{U}_d$ CG transform, $U^{[d]}_{\mathrm{CG}}$, is decomposed into a $\mathcal{U}_{d-1}$ CG transform $\tilde{U}^{[d-1]}_{\mathrm{CG}}$ (see Eq. (7.37)) and a reduced Wigner operator $\hat{T}^{[d]}$. In Fig. 7-4 we show how to reduce the reduced Wigner operator to a $d \times d$ matrix conditioned on $\mu$ and $\mu' + e_{j'}$.

[Figure 7-4: circuit diagram with inputs $|\mu\rangle$, $|\mu'\rangle$, $|\mu' + e_{j'}\rangle$ and outputs $|\mu\rangle$, $|\mu + e_j\rangle$, $|\mu' + e_{j'}\rangle$]

Figure 7-4: The reduced Wigner transform $\hat{T}^{[d]}$ can be expressed as a $d \times d$ rotation whose coefficients are controlled by $\mu$ and $\mu' + e_{j'}$.



We have now reduced the problem of performing the CG transform $U^{[d]}_{\mathrm{CG}}$ to the problem of computing reduced Wigner coefficients $\hat{T}^{\mu,j,\mu',j'}$.

7.3.3 Efficient Circuit for the Reduced Wigner Operator 

The method of Biedenharn and Louck [BL68] allows us to compute reduced Wigner coefficients for the cases we are interested in. This will allow us to construct an efficient circuit to implement the controlled-$\hat{T}$ operator to accuracy $\epsilon$ using an overhead which scales like poly($\log n$, $d$, $\log(\epsilon^{-1})$).






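To compute $\hat{T}^{\mu,j,\mu',j'}$, we first introduce the vectors $\tilde{\mu} := \mu + \sum_{j=1}^{d} (d - j) e_j$ and $\tilde{\mu}' := \mu' + \sum_{j=1}^{d-1} (d - 1 - j) e_j$. Also define $S(j - j')$ to be $1$ if $j > j'$ and $-1$ if $j \le j'$. Then according to Eq. (38) in Ref. [BL68],

$$ \hat{T}^{\mu,j,\mu',j'} = \begin{cases} S(j - j') \left[ \dfrac{\prod_{s \in [d-1] \setminus j'} (\tilde{\mu}_j - \tilde{\mu}'_s) \; \prod_{t \in [d] \setminus j} (\tilde{\mu}'_{j'} - \tilde{\mu}_t + 1)}{\prod_{s \in [d] \setminus j} (\tilde{\mu}_j - \tilde{\mu}_s) \; \prod_{t \in [d-1] \setminus j'} (\tilde{\mu}'_{j'} - \tilde{\mu}'_t + 1)} \right]^{1/2} & \text{if } j' \in \{1, \ldots, d-1\}, \\[3ex] S(j - d) \left[ \dfrac{\prod_{s \in [d-1]} (\tilde{\mu}_j - \tilde{\mu}'_s)}{\prod_{s \in [d] \setminus j} (\tilde{\mu}_j - \tilde{\mu}_s)} \right]^{1/2} & \text{if } j' = 0. \end{cases} \tag{7.38} $$

The elements of the partitions here are of size $O(n)$, so the total computation necessary is poly($d$, $\log n$). Now how do we implement the $\hat{T}^{[d]}$ transform given this expression?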
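The classical part of this computation can be sketched as follows. The code is an illustration under stated assumptions, not the thesis construction: it takes the coefficient to be a signed square root of a ratio of products of shifted weights, with $\tilde{\mu}_j = \mu_j + d - j$ and $\tilde{\mu}'_j = \mu'_j + d - 1 - j$, and it assumes the sign convention $S(x) = 1$ for $x > 0$ and $-1$ otherwise, which is the choice that makes the matrices in the tested examples orthogonal. The function names are hypothetical, and the matrix is assembled for a fixed $\mu$ and a fixed $\nu = \mu' + e_{j'}$, so that column $j'$ uses $\mu' = \nu - e_{j'}$. Only inputs for which every such $\mu'$ is a valid weight interlacing $\mu$ are handled.

```python
from fractions import Fraction
from math import prod, sqrt

def shifted(weight):
    """weight_j + (len - j) for j = 1..len, written with 0-based indices."""
    n = len(weight)
    return [weight[k] + (n - 1 - k) for k in range(n)]

def sign(x):
    """Assumed convention: +1 for x > 0, -1 for x <= 0."""
    return 1 if x > 0 else -1

def reduced_wigner_sq(mu, mu_p, j, jp):
    """Squared magnitude of the coefficient for j in 1..d and jp in 0..d-1."""
    d = len(mu)
    mt, mpt = shifted(mu), shifted(mu_p)
    den = prod(mt[j - 1] - mt[s] for s in range(d) if s != j - 1)
    if jp == 0:
        num = prod(mt[j - 1] - mpt[s] for s in range(d - 1))
        return Fraction(num, den)
    num = prod(mt[j - 1] - mpt[s] for s in range(d - 1) if s != jp - 1)
    num *= prod(mpt[jp - 1] - mt[t] + 1 for t in range(d) if t != j - 1)
    den *= prod(mpt[jp - 1] - mpt[t] + 1 for t in range(d - 1) if t != jp - 1)
    return Fraction(num, den)

def wigner_matrix(mu, nu):
    """d x d matrix (rows j = 1..d, columns jp = 0..d-1) for fixed mu and nu = mu' + e_jp."""
    d = len(mu)
    rows = []
    for j in range(1, d + 1):
        row = []
        for jp in range(d):
            mu_p = list(nu)
            if jp > 0:
                mu_p[jp - 1] -= 1  # undo the box added in row jp
            s = sign(j - d) if jp == 0 else sign(j - jp)
            row.append(s * sqrt(reduced_wigner_sq(mu, tuple(mu_p), j, jp)))
        rows.append(row)
    return rows
```

For $d = 2$ the squared entries reduce to the familiar ratios $(J \pm M + \ldots)/(2J+1)$ of spin-$\tfrac{1}{2}$ coupling. Exact rational arithmetic is used for the squared magnitudes so that only the final square root is approximate, which mirrors the remark that the entries are of size $O(n)$ and cheap to compute.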

As in the introduction to this section, note that any unitary gate of dimension $d$ can be implemented using a number of two-qubit gates polynomial in $d$ [RZBB94, Bar95, NC00]. The method of this construction is to take a unitary gate of dimension $d$ with known matrix elements and then convert this into a series of unitary gates which act non-trivially only on two states. These two-state gates can then be constructed using the methods described in [Bar95]. In order to modify this for our work, we calculate, to the specified accuracy $\epsilon$, the elements of the $\hat{T}^{[d]}$ operator, conditional on the $\mu$ and $\mu' + e_{j'}$ inputs, perform the decomposition into two-qubit gates as described in [RZBB94, Bar95] online, and then, conditional on this calculation, perform the appropriate controlled two-qubit gates onto the space where $\hat{T}^{[d]}$ will act. Finally this classical computation must be undone to reset any garbage bits created during the classical computation. To produce an accuracy $\epsilon$ we need a classical computation of size poly($\log(1/\epsilon)$) since we can perform the appropriate controlled rotations with bitwise accuracy.

Putting everything together as depicted in Figures 7-3 and 7-4 gives a poly($d$, $\log n$, $\log 1/\epsilon$) algorithm to reduce $U^{[d]}_{\mathrm{CG}}$ to $U^{[d-1]}_{\mathrm{CG}}$. Naturally this can be applied $d$ times to yield a poly($d$, $\log n$, $\log 1/\epsilon$) algorithm for $U^{[d]}_{\mathrm{CG}}$. (We can end the recursion either at $d = 2$, using the construction in [BCH04], or at $d = 1$, where the CG transform simply consists of the map $\mu \to \mu + 1$ for $\mu \in \mathbb{Z}$, or even at $d = 0$, where the CG transform is completely trivial.) We summarize the CG algorithm as follows.

Algorithm: Clebsch-Gordan transform

Inputs: (1) Classical registers $d$ and $n$. (2) Quantum registers $|\lambda\rangle$ (in any superposition over different $\lambda \in \mathcal{I}_{d,n}$), $|q\rangle \in \mathcal{Q}_\lambda^d$ (expressed as a superposition of GZ basis elements) and $|i\rangle \in \mathbb{C}^d$.
Outputs: (1) Quantum registers $|\lambda\rangle$ (equal to the input), $|j\rangle \in \mathbb{C}^d$ (satisfying $\lambda + e_j \in \mathcal{I}_{d,n+1}$) and $|q'\rangle \in \mathcal{Q}^d_{\lambda + e_j}$.
Runtime: $d^3$ poly($\log n$, $\log 1/\epsilon$) to achieve accuracy $\epsilon$.
Procedure:

1. If $d = 1$

2.     Then output $|j\rangle := |i\rangle = |1\rangle$ and $|q'\rangle := |q\rangle = |1\rangle$ (i.e. do nothing).

3. Else

4.     Unpack $|q\rangle$ into $|\mu'\rangle|q_{(d-2)}\rangle$, such that $\mu' \in \mathcal{I}_{d-1,m}$, $m \le n$, $\mu' \precsim \mu$ and $|q_{(d-2)}\rangle \in \mathcal{Q}^{d-1}_{\mu'}$ (writing $\mu$ for the input irrep label $\lambda$, as in Section 7.3.2).

5.     If $i < d$

6.         Then perform the CG transform with inputs $(d-1, m, |\mu'\rangle, |q_{(d-2)}\rangle, |i\rangle)$ and outputs $(|\mu'\rangle, |j'\rangle, |q'_{(d-2)}\rangle)$.

7.     Else (if $i = d$)

8.         Replace $|i\rangle = |d\rangle$ with $|j'\rangle := |0\rangle$ and set $|q'_{(d-2)}\rangle := |q_{(d-2)}\rangle$.

9.     End. (Now $i \in \{1, \ldots, d\}$ has been replaced by $j' \in \{0, \ldots, d-1\}$.)

10.     Map $|\mu'\rangle|j'\rangle$ to $|\mu' + e_{j'}\rangle|j'\rangle$.

11.     Conditioned on $\mu$ and $\mu' + e_{j'}$, calculate the gate sequence necessary to implement $\hat{T}^{[d]}$, which inputs $|j'\rangle$ and outputs $|j\rangle$.

12.     Execute this gate sequence, implementing $\hat{T}^{[d]}$.

13.     Undo the computation from 11.

14.     Combine $|\mu' + e_{j'}\rangle$ and $|q'_{(d-2)}\rangle$ to form $|q'\rangle$.

15. End.

Finally, in Section 7.2 we described how $n$ CG transforms can be used to perform the Schur transform, so that $U_{\mathrm{Sch}}$ can be implemented in time $n \cdot$ poly($d$, $\log n$, $\log 1/\epsilon$), optionally plus an additional poly($n$) time to compress the $|p\rangle$ register.

Chapter 8

Relations between the Schur transform and the $\mathcal{S}_n$ QFT



This final chapter is devoted to algorithmic connections between the Schur transform and the quantum
Fourier transform on $\mathcal{S}_n$. In Section 8.1 we describe generalized phase estimation, which is a reduction
from measuring in the Schur basis (a weaker problem than the full Schur transform) to the $\mathcal{S}_n$ QFT.
Then in Section 8.2 we show a reduction in the other direction, from the $\mathcal{S}_n$ QFT to the Schur
transform. The goal of these reductions is not so much to perform new tasks efficiently, since efficient
implementations of the QFT already exist, but to help clarify the position of the Schur transform
vis-a-vis known algorithms.

8.1 Generalized phase estimation 

The last chapter developed the Schur transform based on the $\mathcal{U}_d$ CG transform. Can we instead
build the Schur transform out of operations on $\mathcal{S}_n$? This section explores that possibility. We will see
that using the $\mathcal{S}_n$ QFT allows us to efficiently measure a state in the Schur basis, a slightly weaker
task than performing the full Schur transform. Our algorithm for this measurement generalizes the
quantum circuits used to estimate the phase of a black-box unitary transform [Sho94, KSV02] (see
also [KR03]) to a nonabelian setting; hence we call it generalized phase estimation (GPE). As we will
see, our techniques actually extend to measuring the irrep labels in reducible representations of any
group for which we can efficiently perform group operations and a quantum Fourier transform.

The main idea behind GPE is presented in Section 8.1.1, where we show how it can be used to
measure $|\lambda\rangle$ (and optionally $|p\rangle$ as well) in the Schur basis. Here the techniques are completely general
and we show how similar results hold for any group. We specialize to Schur basis measurements in
Section 8.1.2, where we show that GPE can be extended to also measure the $\mathcal{Q}_\lambda^d$ register, thereby
making a complete Schur basis measurement possible based only on the $\mathcal{S}_n$ QFT. We conclude in
Section 8.1.3 with an alternate interpretation of GPE, which shows its close connection with the $\mathcal{S}_n$
CG transform.

8.1.1 Using GPE to measure $\lambda$ and $\mathcal{P}_\lambda$

Let $G$ be an arbitrary finite group over which there exists an efficient circuit for the quantum Fourier
transform [MRR04], $U_{\rm QFT}$. Fix a set of inequivalent irreps $\hat{G}$, where $\mu \in \hat{G}$ corresponds to the irrep
$(\mathbf{r}_\mu, V_\mu)$. $U_{\rm QFT}$ then maps the group algebra $\mathbb{C}[G]$ to $\bigoplus_{\mu\in\hat{G}} V_\mu \otimes V_\mu^*$, and is explicitly given by
Eq. (5.9).

Now suppose $(\rho, V)$ is a representation of $G$ for which we can efficiently perform the controlled-$\rho$
operation, $C_\rho = \sum_{g\in G} |g\rangle\langle g| \otimes \rho(g)$. To specialize to the Schur transform we will choose $V = (\mathbb{C}^d)^{\otimes n}$
and $\rho = \mathbf{P}$, but everything in the section can be understood in terms of arbitrary $G$ and $(\rho, V)$. Let
the multiplicity of the irrep $V_\nu$ in $V$ be given by $m_\nu$, so that $V$ decomposes as

$$ V \cong \bigoplus_{\nu\in\hat{G}} V_\nu \otimes \mathbb{C}^{m_\nu}. \tag{8.1} $$



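This induces a basis for $V$, analogous to the Schur basis, given by $|\nu, \alpha, k\rangle_V$, where $\nu \in \hat{G}$, $\alpha \in [m_\nu]$
and $k \in [d_\nu]$, where $d_\nu := \dim V_\nu$. For any $\lambda \in \hat{G}$, define the projector onto the $V_\lambda$-isotypic subspace
in terms of this basis as

$$ \Pi_\lambda = |\lambda\rangle\langle\lambda| \otimes I_{m_\lambda} \otimes I_{d_\lambda}. \tag{8.2} $$

Note that this becomes Eq. (6.18) for the special case of $(\rho, V) = (\mathbf{P}, (\mathbb{C}^d)^{\otimes n})$.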

The problem is that, as with the Schur basis, there is no immediately obvious way to measure or
otherwise access the register labeling the irreps. We are given no information about the isomorphism
in Eq. (8.1) or about how to implement it. However, by using the Fourier transform along with the
controlled-$\rho$ operator, it is possible to efficiently perform the projective measurement $\{\Pi_\lambda\}_{\lambda\in\hat{G}}$. To
do so, we define the operator

$$ \hat{C}_\rho = (U_{\rm QFT} \otimes I_V)\, C_\rho\, (U_{\rm QFT}^\dagger \otimes I_V) \tag{8.3} $$

acting on $\bigoplus_\mu V_\mu \otimes V_\mu^* \otimes V$. This is represented in Fig. 8-1.


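[Figure 8-1: Quantum circuit $\hat{C}_\rho$ used in generalized phase estimation. The registers $|\mu\rangle$, $|i\rangle$, $|j\rangle$ pass
through $U_{\rm QFT}^\dagger$, control $\rho(g)$ acting on $|\psi\rangle$, and then pass through $U_{\rm QFT}$.]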



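The procedure for performing the projective measurement $\{\Pi_\lambda\}_{\lambda\in\hat{G}}$ is as follows:

Algorithm: Generalized Phase Estimation
Inputs: A state $|\psi\rangle \in V$.

Outputs: (1) Classical variable $\lambda$ with probability $p_\lambda := \langle\psi|\Pi_\lambda|\psi\rangle$

(2) The state $\Pi_\lambda|\psi\rangle / \sqrt{p_\lambda}$.
Runtime: $2T_{\rm QFT} + T_{C_\rho}$, where $T_{\rm QFT}$ (resp. $T_{C_\rho}$) is the running time for the QFT on $G$ (resp.
controlled-$\rho$ operation).
Procedure:

1. Create registers $|\mu\rangle|i\rangle|j\rangle$ (see Fig. 8-1) with $\mu$ corresponding to the trivial representation $V_0$
and $|i\rangle = |j\rangle = |1\rangle \in V_0$.

2. Apply $\hat{C}_\rho$. This involves three steps.

a) Apply $U_{\rm QFT}^\dagger$ to $|\mu\rangle|i\rangle|j\rangle$, obtaining the uniform superposition $|G|^{-1/2} \sum_{g\in G} |g\rangle$.

b) Perform $C_\rho = \sum_g |g\rangle\langle g| \otimes \rho(g)$.

c) Apply $U_{\rm QFT}$ to the first register.

The output $|\psi_{\rm out}\rangle$ is a superposition of $|\lambda\rangle|i'\rangle|j'\rangle|v\rangle$ with $\lambda \in \hat{G}$, $|i'\rangle \in V_\lambda$, $|j'\rangle \in V_\lambda^*$ and
$|v\rangle \in V$.

3. Measure $\lambda$.

4. Optionally perform $\hat{C}_\rho^\dagger$. This is only necessary if we need the residual state $\Pi_\lambda|\psi\rangle$.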
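
To see how this procedure reduces to ordinary phase estimation, the following is a minimal classical simulation for the abelian group $G = \mathbb{Z}_N$. Everything in it is an illustrative assumption rather than part of the algorithm above: the function name `gpe_cyclic`, the choice $V = \mathbb{C}^N$ with $\rho(g)|k\rangle = \omega^{gk}|k\rangle$, and the sign convention $\omega^{-\lambda g}/\sqrt{N}$ for the Fourier transform, which is chosen so that the measured label equals $k$.

```python
import cmath
from typing import List


def gpe_cyclic(c: List[complex]) -> List[float]:
    """Simulate GPE for G = Z_N acting on V = C^N by rho(g)|k> = w^(g k)|k>.

    c[k] is the amplitude of |k> in the input state; the return value is the
    probability of measuring each irrep label lam = 0, ..., N-1.
    """
    n = len(c)
    w = cmath.exp(2j * cmath.pi / n)
    # Steps 2a-2b: uniform superposition over g, followed by controlled-rho.
    # state[g][k] is the amplitude of |g>|k>.
    state = [[c[k] * w ** (g * k) / n ** 0.5 for k in range(n)] for g in range(n)]
    # Step 2c: Fourier transform on the group register.
    # Assumed convention: <lam| U_QFT |g> = w^(-lam g) / sqrt(N).
    out = [
        [sum(w ** (-lam * g) * state[g][k] for g in range(n)) / n ** 0.5 for k in range(n)]
        for lam in range(n)
    ]
    # Step 3: probability of each outcome lam.
    return [sum(abs(a) ** 2 for a in out[lam]) for lam in range(n)]


if __name__ == "__main__":
    print(gpe_cyclic([0.6, 0, 0.8j, 0]))
```

In this abelian case every irrep is one-dimensional, so the $|i\rangle$ and $|j\rangle$ registers are trivial and the outcome probability for $\lambda$ is simply $|c_\lambda|^2$.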
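To analyze this circuit, expand $|\psi\rangle$ in the $|\mu\rangle|\alpha\rangle|k\rangle_V$ basis as

$$ |\psi\rangle = \sum_{\mu\in\hat{G}} \sum_{\alpha=1}^{m_\mu} \sum_{k=1}^{d_\mu} c_{\mu,\alpha,k}\, |\mu, \alpha, k\rangle_V. \tag{8.4} $$

Eq. (8.1) means that $\rho(g)$ acts on $|\psi\rangle$ according to

$$ \rho(g)|\psi\rangle = \sum_{\mu\in\hat{G}} \sum_{\alpha=1}^{m_\mu} \sum_{k=1}^{d_\mu} c_{\mu,\alpha,k}\, |\mu\rangle|\alpha\rangle \left(\mathbf{r}_\mu(g)|k\rangle\right). \tag{8.5} $$

Now examine the $\mathbb{C}[G]$ register. The initial $U_{\rm QFT}^\dagger$ in $\hat{C}_\rho$ maps the trivial irrep to the uniform
superposition of group elements $|G|^{-1/2} \sum_{g\in G} |g\rangle$. This is analogous to the initialization step of phase
estimation on abelian groups [Sho94, KSV02]. Thus the output of the circuit in Fig. 8-1 is

$$ |\psi_{\rm out}\rangle = \sum_{g\in G} \sum_{\lambda\in\hat{G}} \sum_{i,j=1}^{d_\lambda} \frac{\sqrt{d_\lambda}}{|G|} [\mathbf{r}_\lambda(g)]_{i,j}\, |\lambda, i, j\rangle \otimes \rho(g)|\psi\rangle. \tag{8.6} $$

We can simplify this using Eq. (8.5) and the orthogonality relations for irrep matrix elements [GW98]
to reexpress Eq. (8.6) as

$$ |\psi_{\rm out}\rangle = \sum_{\lambda\in\hat{G}} \sum_{i,j=1}^{d_\lambda} \sum_{\alpha=1}^{m_\lambda} \frac{c_{\lambda,\alpha,i}}{\sqrt{d_\lambda}}\, |\lambda, i, j\rangle \otimes |\lambda, \alpha, j\rangle_V. \tag{8.7} $$

The output $|\psi_{\rm out}\rangle$ has several interesting properties which we can now exploit. Measuring the first
register (the irrep label index) produces outcome $\lambda$ with the correct probability $\sum_{\alpha=1}^{m_\lambda} \sum_{k=1}^{d_\lambda} |c_{\lambda,\alpha,k}|^2$.
Remarkably, this is achieved independent of the basis in which $\hat{C}_\rho$ is implemented. As mentioned
above, this reduces to measuring the irrep label $\lambda$ in the Schur basis when $G = \mathcal{S}_n$ and $(\rho, V) =
(\mathbf{P}, (\mathbb{C}^d)^{\otimes n})$. In this case, the circuit requires running time poly($n$) for the $\mathcal{S}_n$ QFT [Bea97] and
$O(n \log d)$ time for the controlled permutation $C_{\mathbf{P}}$, comparable to the efficiency of the Schur transform
given in the last chapter.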
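
The outcome probabilities $p_\lambda = \langle\psi|\Pi_\lambda|\psi\rangle$ can be checked classically for a tiny case using the standard character formula for the isotypic projector, $\Pi_\lambda = \frac{d_\lambda}{|G|}\sum_{g} \chi_\lambda(g)^* \rho(g)$. The sketch below does this for $\mathcal{S}_3$ permuting three tensor positions, with the input restricted to a computational basis string. The function names and the string labels for the three irreps are illustrative assumptions; the character values (1, number of fixed points minus 1, and the sign) are the standard ones for $\mathcal{S}_3$.

```python
from fractions import Fraction
from itertools import permutations
from typing import Dict, Tuple

# Dimensions of the three irreps of S_3, labeled by partitions of 3.
DIMS = {"(3)": 1, "(2,1)": 2, "(1,1,1)": 1}


def sign(p: Tuple[int, ...]) -> int:
    """Sign of a permutation, computed from its number of inversions."""
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1


def characters_s3(p: Tuple[int, ...]) -> Dict[str, int]:
    """Character of each S_3 irrep evaluated at the permutation p."""
    fixed = sum(1 for i, x in enumerate(p) if i == x)
    return {"(3)": 1, "(2,1)": fixed - 1, "(1,1,1)": sign(p)}


def irrep_probabilities(s: Tuple[int, ...]) -> Dict[str, Fraction]:
    """p_lambda = <s|Pi_lambda|s> for a basis string s of length 3."""
    probs = {lam: Fraction(0) for lam in DIMS}
    for p in permutations(range(3)):
        permuted = tuple(s[p[i]] for i in range(3))
        # <s|P(p)|s> is 1 when the permutation fixes the string and 0 otherwise.
        if permuted == s:
            for lam, chi in characters_s3(p).items():
                probs[lam] += Fraction(DIMS[lam] * chi, 6)
    return probs


if __name__ == "__main__":
    print(irrep_probabilities((0, 0, 1)))
```

For the string 001 this gives weight 1/3 on the symmetric irrep, 2/3 on the two-dimensional irrep and none on the sign irrep, which is the distribution GPE would sample from on that input.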

This circuit also allows us to perform arbitrary instruments on the irrep spaces $V_\lambda$; for example,
we could perform a complete measurement, or could perform a unitary rotation conditioned on $\lambda$.
This is because Eq. (8.7) has extracted the irrep basis vector from $|\psi\rangle$ into the $|i\rangle$ register. We can
perform an arbitrary instrument on this $V_\lambda$ register, and then return the information to the $V$ register
by performing $\hat{C}_\rho^\dagger$.

To put this more formally, suppose we want to perform an instrument with operation elements

$$ \sum_{\lambda\in\hat{G}} |\lambda\rangle\langle\lambda| \otimes I_{m_\lambda} \otimes A_x^{(\lambda)} \tag{8.8} $$

on $V$, where $x$ labels the outcomes of the instrument and the normalization condition is that
$\sum_x (A_x^{(\lambda)})^\dagger A_x^{(\lambda)} = I_{d_\lambda}$ for each $\lambda$. Then this can be effected by performing the instrument

$$ \hat{C}_\rho^\dagger \left( \sum_{\lambda\in\hat{G}} |\lambda\rangle\langle\lambda| \otimes A_x^{(\lambda)} \otimes I_{d_\lambda} \otimes I_V \right) \hat{C}_\rho. \tag{8.9} $$

This claim can be verified by explicit calculation and use of the orthogonality relations, but we will
give a simpler proof in Section 8.1.3.
To recap, so far we have shown how GPE can be used to efficiently:

• measure the $|\lambda\rangle$ and $|p\rangle$ registers, or perform general instruments of the form of Eq. (8.8), in
the Schur basis of $(\mathbb{C}^d)^{\otimes n}$ using poly($n$) + $O(n \log d)$ gates; and

• perform instruments of the form of Eq. (8.8) for any group $G$ and representation $(\rho, V)$ such
that the QFT on $G$ and the controlled-$\rho$ operation can be implemented efficiently.



8.1.2 Using GPE to measure $\mathcal{Q}_\lambda^d$

In this section we specialize to the case of the Schur basis and show how GPE can be adapted to
measure the $|q\rangle$ register. This allows us to perform a complete measurement in the $|\lambda\rangle|p\rangle|q\rangle_{\rm Sch}$ basis,
or more generally, to perform instruments with operation elements

$$ \sum_{\lambda\in\mathcal{I}_{d,n}} \sum_{q\in\mathcal{Q}_\lambda^d} U_{\rm Sch}^\dagger \left( |\lambda\rangle\langle\lambda| \otimes |q\rangle\langle q| \otimes A_x^{(\lambda,q)} \right) U_{\rm Sch}. \tag{8.10} $$

Here $\mathcal{Q}_\lambda^d$ is the GZ basis defined in Section 7.1.2. We will find that the running time is
$d \cdot$ poly($n$, log $d$, log 1/$\epsilon$), which is comparable to the running time of the circuits in Section 7.2, but
has slightly less dependence on $d$ and slightly more dependence on $n$.* More importantly, it gives a
conceptually independent method for a Schur basis measurement.

The main idea is that we can measure $|q\rangle \in \mathcal{Q}_\lambda^d$ by measuring the irrep label $q_c$ for each subgroup
$\mathcal{U}_c \subset \mathcal{U}_d$, $c = 1, \ldots, d-1$. We can measure $q_c$ by performing GPE in a way that only looks at registers
in states $|1\rangle, \ldots, |c\rangle$. As these measurements commute [Bie63] (in fact, they are simultaneously
diagonalized by the GZ basis [GZ50]), we can perform them sequentially without worrying about the
disturbance that they cause. After performing this modified GPE $d-1$ times, we can extract the
register $|q\rangle$ in addition to the $|\lambda\rangle|p\rangle$ that we get from the first application of GPE.

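We now describe this modification of GPE in more detail. To do so, we will need to consider
performing GPE on a variable number of qubits. Define $U_{\rm GPE}^{(d,n)}$ by

$$ U_{\rm GPE}^{(d,n)} = \sum_{\lambda\in\mathcal{I}_{d,n}} |\lambda\rangle \otimes (U_{\rm Sch}^{(d,n)})^\dagger \left( |\lambda\rangle\langle\lambda| \otimes I_{\mathcal{Q}_\lambda^d} \otimes I_{\mathcal{P}_\lambda} \right) U_{\rm Sch}^{(d,n)} = \sum_{\lambda\in\mathcal{I}_{d,n}} |\lambda\rangle \otimes \Pi_\lambda^{(d,n)}. \tag{8.11} $$

This coherently extracts the $|\lambda\rangle$ register from $(\mathbb{C}^d)^{\otimes n}$. Here we have also explicitly written out the
dependence of $\Pi_\lambda^{(d,n)}$ and $U_{\rm Sch}^{(d,n)}$ on $d$ and $n$. Also, we have expressed $U_{\rm GPE}^{(d,n)}$ as an isometry to avoid
writing out the ancilla qubits initialized to zero, but it is of course a reversible unitary transform.

For example, we use GPE to construct $\Pi_\lambda^{(d,n)}$ by performing $U_{\rm GPE}^{(d,n)}$, measuring $\lambda$ and then undoing
$U_{\rm GPE}^{(d,n)}$:

$$ \Pi_\lambda^{(d,n)} = (U_{\rm GPE}^{(d,n)})^\dagger \left( |\lambda\rangle\langle\lambda| \otimes I_d^{\otimes n} \right) U_{\rm GPE}^{(d,n)}. \tag{8.12} $$

The observables we want to measure correspond to determining the irrep label of $\mathcal{U}_c$. For any
$\mathcal{U}_c$-representation $(\mathbf{q}, \mathcal{Q})$, we define the $\mathcal{Q}_\mu^c$-isotypic subspace of $\mathcal{Q}$ to be the direct sum of the irreps
in the decomposition of $\mathcal{Q}$ that are isomorphic to $\mathcal{Q}_\mu^c$ (cf. Eq. (5.2)). The projector onto this subspace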



*If $d$ is much larger than $n$, then it is always possible (even with the CG-based Schur transform) to reduce the time
for a Schur basis measurement to poly($n$, log 1/$\epsilon$) + $O(n \log d)$. This is because, given a string $|i_1, \ldots, i_n\rangle \in (\mathbb{C}^d)^{\otimes n}$,
we can first measure the type in time $O(n \log d)$ and then unitarily map $|i_1, \ldots, i_n\rangle$ to $|i'_1, \ldots, i'_n\rangle$, where $i'_j \in [n]$ and
$i'_j = i'_k$ iff $i_j = i_k$. Measuring $|i'_1, \ldots, i'_n\rangle$ in the Schur basis then requires poly($n$, log 1/$\epsilon$) time, and the measured value
of $|q\rangle$ can be translated to the proper value by replacing each instance of $i'_j$ in the Young tableau with $i_j$. Moreover, the
final answer can be used to uncompute the type, so this modification also works when implementing $U_{\rm Sch}$ rather than
simply a Schur basis measurement.
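can be given explicitly (though we will not need the exact formula) in terms of $\mathbf{q}$ as follows:

$$ \pi_\mu(\mathbf{q}) = \dim \mathcal{Q}_\mu^c \int_{\mathcal{U}_c} dU \left( \operatorname{Tr} \mathbf{q}_\mu^c(U) \right)^* \mathbf{q}(U), \tag{8.13} $$

where $dU$ is a Haar measure on $\mathcal{U}_c$. Define the $\mathcal{U}_c$-representation $(\mathbf{Q}_c^n, (\mathbb{C}^d)^{\otimes n})$ by $\mathbf{Q}_c^n(U) = (U \oplus
I_{d-c})^{\otimes n}$, where $(U \oplus I_{d-c})$ is the embedding of $\mathcal{U}_c$ in $\mathcal{U}_d$ given by Eq. (7.3). Our goal is to perform the
projective measurements $\{\pi_\mu(\mathbf{Q}_c^n)\}_{\mu\in\mathbb{Z}^c_{++},\, |\mu|\leq n}$ for $c = 1, \ldots, d$. Since for each $c$, $\pi_\mu(\mathbf{Q}_c^n)$ is diagonal
in the GZ basis $\mathcal{Q}_\lambda^d$, the projectors commute and can be measured simultaneously.

For the special case of $|\lambda| = m = n$ and $c = d$, $\mathbf{Q}_d^n$ is the same as the $\mathbf{Q}$ defined in Eq. (5.11) and
we have $\Pi_\lambda^{(d,n)} = \pi_\lambda(\mathbf{Q}_d^n)$. In this case, Eq. (8.12) tells us how to perform the projective measurement
$\{\pi_\lambda(\mathbf{Q}_d^n)\}_{\lambda\in\mathcal{I}_{d,n}}$. We now need to extend this to measure $\{\pi_\mu(\mathbf{Q}_c^n)\}_{\mu\in\mathcal{I}_{c,m},\, m\leq n}$ for any $c \in [d]$.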

The first step in doing so is to measure the number of positions $j$ in $|i_1, \ldots, i_n\rangle$ where $i_j \in \{1, \ldots, c\}$.
Call this number $m$. Though we will measure $m$, we will not identify which $j$ have $i_j \in [c]$. Instead
we will coherently separate them by performing the unitary operation $U_{\rm sel}^{(c)}$ which implements the
isomorphism:

$$ (\mathbb{C}^d)^{\otimes n} \cong \bigoplus_{m=0}^{n} (\mathbb{C}^c)^{\otimes m} \otimes (\mathbb{C}^{d-c})^{\otimes n-m} \otimes \mathbb{C}^{\binom{n}{m}}. \tag{8.14} $$

It is straightforward to implement $U_{\rm sel}^{(c)}$ in time linear in the size of the input (i.e. $O(n \log d)$), though
we will have to pad quantum registers as in the original Schur transform.
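
On computational basis strings the isomorphism of Eq. (8.14) is just a reversible rearrangement, which the following classical sketch illustrates. The functions `select`, `unselect` and `total_dimension` are hypothetical names of ours, and encoding the $\binom{n}{m}$ register as a list of positions is an assumption made for readability; an actual circuit would also need the register padding mentioned above.

```python
from math import comb
from typing import List, Tuple


def select(s: List[int], c: int) -> Tuple[int, List[int], List[int], List[int]]:
    """Split a string s in [d]^n (values 1..d) into (m, alpha, beta, positions).

    alpha lists the symbols in {1..c} in order, beta lists the remaining symbols
    shifted down to {1..d-c}, and positions records where the small symbols sat.
    """
    positions = [j for j, x in enumerate(s) if x <= c]
    alpha = [x for x in s if x <= c]
    beta = [x - c for x in s if x > c]
    return len(alpha), alpha, beta, positions


def unselect(m: int, alpha: List[int], beta: List[int], positions: List[int], c: int) -> List[int]:
    """Inverse of select: interleave alpha and beta according to positions."""
    assert m == len(alpha)
    a, b = iter(alpha), iter(beta)
    small = set(positions)
    return [next(a) if j in small else next(b) + c for j in range(len(alpha) + len(beta))]


def total_dimension(d: int, c: int, n: int) -> int:
    """Dimension of the right-hand side of Eq. (8.14), summed over m."""
    return sum(comb(n, m) * c ** m * (d - c) ** (n - m) for m in range(n + 1))


if __name__ == "__main__":
    print(select([1, 3, 2, 3], 2))
    print(total_dimension(3, 2, 4) == 3 ** 4)
```

The dimension check is the binomial theorem, and the round trip through `unselect` confirms that no information about the string is lost when only $m$ is measured.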

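We can use $U_{\rm sel}^{(c)}$ to construct $\pi_\mu(\mathbf{Q}_c^n)$ in terms of $\pi_\mu(\mathbf{Q}_c^m)$ as follows:

$$ \pi_\mu(\mathbf{Q}_c^n) = (U_{\rm sel}^{(c)})^\dagger \left( |m\rangle\langle m| \otimes \pi_\mu(\mathbf{Q}_c^m) \otimes I_{d-c}^{\otimes n-m} \otimes I_{\binom{n}{m}} \right) U_{\rm sel}^{(c)}. \tag{8.15} $$

If $m = |\mu|$, then we can use Eq. (8.12) to construct the projection on the RHS of Eq. (8.15), obtaining

$$ \pi_\mu(\mathbf{Q}_c^n) = (U_{\rm sel}^{(c)})^\dagger \left( |m\rangle\langle m| \otimes (U_{\rm GPE}^{(c,m)})^\dagger \left( |\mu\rangle\langle\mu| \otimes I_c^{\otimes m} \right) U_{\rm GPE}^{(c,m)} \otimes I_{d-c}^{\otimes n-m} \otimes I_{\binom{n}{m}} \right) U_{\rm sel}^{(c)}. \tag{8.16} $$

This gives a prescription for measuring $\pi_\mu(\mathbf{Q}_c^n)$, whose output $\mu$ corresponds to the component $q_c$
of the GZ basis element. First measure $m = |\mu|$ by counting the number of qudits that have values
in $\{1, \ldots, c\}$. Then select only those $m$ qudits using $U_{\rm sel}^{(c)}$ and perform GPE on them to find $\mu$.

Finally, measuring the commuting observables $\{\pi_\mu(\mathbf{Q}_c^n)\}_{\mu\in\mathbb{Z}^c_{++}}$ for $c = 1, \ldots, d$ yields a complete
von Neumann measurement of the $\mathcal{Q}_\lambda^d$ register with operation elements as follows:

$$ (U_{\rm Sch}^{(d,n)})^\dagger \left( |\lambda\rangle\langle\lambda| \otimes |q\rangle\langle q| \otimes I_{\mathcal{P}_\lambda} \right) U_{\rm Sch}^{(d,n)} = \prod_{c=1}^{d} \pi_{q_c}(\mathbf{Q}_c^n), \tag{8.17} $$

with $q$ ranging over $\mathcal{Q}_\lambda^d$.

Combined with the results of the last section, we now have an algorithm for performing a complete
measurement in the Schur basis, or more generally an arbitrary instrument with operation elements

$$ (U_{\rm Sch}^{(d,n)})^\dagger \left( \sum_{\lambda\in\mathcal{I}_{d,n}} \sum_{q\in\mathcal{Q}_\lambda^d} |\lambda\rangle\langle\lambda| \otimes |q\rangle\langle q| \otimes A_x^{(\lambda,q)} \right) U_{\rm Sch}^{(d,n)}, \tag{8.18} $$

where the $A_x^{(\lambda,q)}$ are arbitrary operators on $\mathcal{P}_\lambda$ and $x$ labels the measurement outcomes. In particular, if
we let $x$ range over triples $(\lambda', q', p')$ with $\lambda' \in \mathcal{I}_{d,n}$, $q' \in \mathcal{Q}_{\lambda'}^d$ and $p' \in \mathcal{P}_{\lambda'}$, and set
$A^{(\lambda,q)}_{\lambda',q',p'} = \delta_{\lambda,\lambda'} \delta_{q,q'} |p'\rangle\langle p'|$,
then Eq. (8.18) corresponds to a complete von Neumann measurement in the Schur basis. The general
algorithm is as follows: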



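Algorithm: Complete Schur basis measurement using GPE
Input: A state $|\psi\rangle \in (\mathbb{C}^d)^{\otimes n}$.

Output: $\sum_{\lambda\in\mathcal{I}_{d,n}} \sum_{q\in\mathcal{Q}_\lambda^d} \sum_x |x\rangle \otimes (U_{\rm Sch}^{(d,n)})^\dagger \left( |\lambda\rangle\langle\lambda| \otimes |q\rangle\langle q| \otimes A_x^{(\lambda,q)} \right) U_{\rm Sch}^{(d,n)} |\psi\rangle$,
corresponding to the coherent output of the instrument in Eq. (8.18).
Runtime: $d \cdot O(T_{\rm QFT}(\mathcal{S}_n) + n \log d)$
Procedure:

1. For $c = 1, \ldots, d-1$:

2. Apply $U_{\rm sel}^{(c)}$ to $|\psi\rangle$, outputting superpositions of $|m\rangle|\alpha^m\rangle|\beta^{n-m}\rangle|\gamma\rangle_{\binom{n}{m}}$.

3. Perform $\sum_{m=0}^{n} |m\rangle\langle m| \otimes U_{\rm GPE}^{(c,m)}$ on $|m\rangle|\alpha^m\rangle$ to output $|m\rangle \sum_{\mu_c\in\mathbb{Z}^c_{++}} |\mu_c\rangle\, \pi_{\mu_c}(\mathbf{Q}_c^m)|\alpha^m\rangle$.

4. Apply $(U_{\rm sel}^{(c)})^\dagger$.

5. Set $|q\rangle := |\mu_1, \ldots, \mu_{d-1}\rangle$.
(Steps 6-10 are based on Eq. (8.9).)

6. Add registers $|\mu\rangle = |(n)\rangle$ and $|i\rangle = |j\rangle = |\,\boxed{1}\boxed{2}\cdots\boxed{n}\,\rangle$ corresponding to the trivial irrep of $\mathcal{S}_n$.

7. Perform $\hat{C}_{\mathbf{P}}$ ($:= (U_{\rm QFT} \otimes I)\, C_{\mathbf{P}}\, (U_{\rm QFT}^\dagger \otimes I)$) on $|\mu\rangle|i\rangle|j\rangle|\psi\rangle$ to output $|\lambda\rangle|i'\rangle|j'\rangle|\psi'\rangle$.

8. Perform the instrument $\left\{ \sum_{\lambda\in\mathcal{I}_{d,n}} \sum_{q\in\mathcal{Q}_\lambda^d} |q\rangle\langle q| \otimes |\lambda\rangle\langle\lambda| \otimes A_x^{(\lambda,q)} \otimes I_{\mathcal{P}_\lambda} \otimes I_d^{\otimes n} \right\}_x$.

9. Apply $\hat{C}_{\mathbf{P}}^\dagger$ to output $|\mu\rangle|i\rangle|j\rangle|\psi''\rangle$.

10. The registers $|\mu\rangle|i\rangle|j\rangle$ are always in the state they were given in step 6 (with $|\mu\rangle = |(n)\rangle$)
and can be discarded.

11. Reverse steps 1-5.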



Generalizations to other groups: Crucial to this procedure is not only that $\mathcal{U}_d$ and $\mathcal{S}_n$ form a
dual reductive pair (cf. Section 5.4), but that both groups have GZ bases, and their canonical towers
of subgroups also form dual reductive pairs. These conditions certainly exist for other groups (e.g.
$\mathcal{U}_{d_1} \times \mathcal{U}_{d_2}$ acting on polynomials of $\mathbb{C}^{d_1+d_2}$), but it is an open problem to find useful applications of
the resulting algorithms.

8.1.3 Connection to the Clebsch-Gordan transform 

In this section, we explain how GPE can be thought of in terms of the CG transform on $G$, or on $\mathcal{S}_n$
when we specialize to the case of the Schur basis. The goal is to give a simple representation-theoretic
interpretation of the measurements described in Section 8.1.1 as well as to point out relations between
the QFT and the CG transform.

We begin with a quick review of GPE. We have two registers, $\mathbb{C}[G]$ and $V$, where $(\rho, V)$ is a
representation of $G$. Assume for simplicity that the isomorphism in Eq. (8.1) is an equality:

$$ V = \bigoplus_{\nu\in\hat{G}} V_\nu \otimes \mathbb{C}^{m_\nu}. \tag{8.19} $$

This means that the controlled-$\rho$ operation $C_\rho$ is given by

$$ C_\rho = \sum_{g\in G} |g\rangle\langle g| \otimes \rho(g) = \sum_{g\in G} |g\rangle\langle g| \otimes \sum_{\nu\in\hat{G}} |\nu\rangle\langle\nu| \otimes \mathbf{r}_\nu(g) \otimes I_{m_\nu}. \tag{8.20} $$

The GPE prescription given in Section 8.1.1 began with initializing $\mathbb{C}[G]$ with the trivial irrep (or
equivalently a uniform superposition over group elements). We will relax this condition and analyze
the effects of $\hat{C}_\rho$ (the Fourier-transformed version of $C_\rho$, cf. Eq. (8.3)) on arbitrary initial states in
$\mathbb{C}[G]$ in order to see how it acts like the CG transform.
There are two equivalent ways of understanding how $\hat{C}_\rho$ acts like the CG transform: in terms of
representation spaces or in terms of representation matrices. We first present the explanation based
on representation matrices. Recall from Section 5.2.3 that $\mathbb{C}[G]$ can be acted on by either the left or
the right representation ($L(h)|g\rangle = |hg\rangle$ and $R(h)|g\rangle = |gh^{-1}\rangle$). The controlled-$\rho$ operation acts on
these matrices as follows

$$ C_\rho \left( L(g) \otimes I_V \right) C_\rho^\dagger = L(g) \otimes \rho(g) \tag{8.21} $$
$$ C_\rho^\dagger \left( R(g) \otimes I_V \right) C_\rho = R(g) \otimes \rho(g) \tag{8.22} $$

The proofs of these claims are straightforward*. If we combine them, we find that

$$ C_\rho \left( L(g_1) R(g_2) \otimes \rho(g_2) \right) C_\rho^\dagger = L(g_1) R(g_2) \otimes \rho(g_1). \tag{8.25} $$
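
Eqs. (8.21) and (8.22) can also be checked mechanically in a small example. The sketch below takes $G = \mathcal{S}_3$ with $\rho$ permuting the three positions of a bit string; since every operator involved is then a permutation of basis states, it suffices to track basis labels. All function names, and the choice of example, are illustrative assumptions.

```python
from itertools import permutations, product
from typing import Tuple

Perm = Tuple[int, ...]
GROUP = list(permutations(range(3)))


def mul(a: Perm, b: Perm) -> Perm:
    """Group product (ab)(j) = a(b(j))."""
    return tuple(a[b[j]] for j in range(3))


def inv(a: Perm) -> Perm:
    out = [0] * 3
    for j, x in enumerate(a):
        out[x] = j
    return tuple(out)


def rho(g: Perm, v: Tuple[int, ...]) -> Tuple[int, ...]:
    """Permutation action on positions: the symbol in slot i moves to slot g(i)."""
    out = [0] * 3
    for i in range(3):
        out[g[i]] = v[i]
    return tuple(out)


def c_rho(g: Perm, v):
    """Controlled-rho on the basis state |g>|v>."""
    return g, rho(g, v)


def c_rho_dag(g: Perm, v):
    """Inverse of controlled-rho on |g>|v>."""
    return g, rho(inv(g), v)


def conj_left(h: Perm, g: Perm, v):
    """Apply C_rho (L(h) x I) C_rho^dagger to |g>|v>, rightmost operator first."""
    g1, v1 = c_rho_dag(g, v)
    return c_rho(mul(h, g1), v1)


def conj_right(h: Perm, g: Perm, v):
    """Apply C_rho^dagger (R(h) x I) C_rho to |g>|v>, rightmost operator first."""
    g1, v1 = c_rho(g, v)
    return c_rho_dag(mul(g1, inv(h)), v1)


def identities_hold() -> bool:
    """Check Eqs. (8.21) and (8.22) on every basis state."""
    for h, g in product(GROUP, GROUP):
        for v in product((0, 1), repeat=3):
            if conj_left(h, g, v) != (mul(h, g), rho(h, v)):
                return False
            if conj_right(h, g, v) != (mul(g, inv(h)), rho(h, v)):
                return False
    return True


if __name__ == "__main__":
    print(identities_hold())
```

The check relies only on $\rho$ being a homomorphism, which is the same fact used in the algebraic proof.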

Let us examine how $C_\rho$ transforms the left and right representations, since any observable on $\mathbb{C}[G]$ can
be constructed out of them. Focus for now on the action of $C_\rho$ on the left representation. Eq. (8.21)
says that conjugation by $C_\rho$ maps the left action on $\mathbb{C}[G]$ to the tensor product action on $\mathbb{C}[G] \otimes V$.
To see how this acts on the representation spaces, we conjugate each operator by $U_{\rm QFT}$, replacing $C_\rho$
with $\hat{C}_\rho$, $L$ with $\hat{L}$ and $R$ with $\hat{R}$. Then $\hat{C}_\rho$ couples an irrep $V_\mu$ from $\mathbb{C}[G] \cong \bigoplus_\mu V_\mu \otimes V_\mu^*$ with an
irrep $V_\nu$ from $V$ and turns this into a sum of irreps $V_\lambda$. This explains how GPE can decompose $V$
into irreps: if we initialize $\mu$ to be the trivial irrep, then only $\lambda = \nu$ appears in the output and measuring
$\lambda$ has the effect of measuring the irrep label of $V$. The right representation is acted on by $\hat{C}_\rho$ in
the opposite manner; we will see that conjugating by $\hat{C}_\rho$ corresponds to the inverse CG transform,
mapping $V_\mu^*$ in the input to $V_\lambda^* \otimes V_\nu$ in the output.

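To make this concrete, Fourier transform each term in Eq. (8.25) to obtain

$$ \hat{C}_\rho^\dagger \left( \sum_{\mu\in\hat{G}} |\mu\rangle\langle\mu| \otimes \mathbf{r}_\mu(g_1) \otimes \mathbf{r}_\mu(g_2^{-1}) \otimes \sum_{\nu\in\hat{G}} |\nu\rangle\langle\nu| \otimes \mathbf{r}_\nu(g_1) \otimes I_{m_\nu} \right) \hat{C}_\rho \tag{8.26} $$
$$ = (U_{\rm QFT} \otimes I_V)\, C_\rho^\dagger \left( L(g_1) R(g_2) \otimes \rho(g_1) \right) C_\rho\, (U_{\rm QFT}^\dagger \otimes I_V) \tag{8.27} $$
$$ = (U_{\rm QFT} \otimes I_V) \left( L(g_1) R(g_2) \otimes \rho(g_2) \right) (U_{\rm QFT}^\dagger \otimes I_V) \tag{8.28} $$
$$ = \sum_{\lambda\in\hat{G}} |\lambda\rangle\langle\lambda| \otimes \mathbf{r}_\lambda(g_1) \otimes \mathbf{r}_\lambda(g_2^{-1}) \otimes \sum_{\nu\in\hat{G}} |\nu\rangle\langle\nu| \otimes \mathbf{r}_\nu(g_2) \otimes I_{m_\nu}. \tag{8.29} $$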

To understand this we need to work backwards. Measuring an observable on the $V_\lambda$ register of
the final state corresponds to measuring that observable on the $V_\lambda$-isotypic subspace of the original
$V_\mu \otimes V_\nu$ inputs. On the other hand, the initial $V_\mu^*$ register splits into $V_\lambda^*$ and $V_\nu$ registers. We can see
an example of this in Eq. (8.7), where the $V_\mu$ register ($|i\rangle$) has been transferred to $V_\lambda$, while $V_\lambda^*$ and



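*Here we prove Eqns. (8.21) and (8.22):

$$ C_\rho \left( L(h) \otimes I_V \right) C_\rho^\dagger = \left( \sum_{g_1\in G} |hg_1\rangle\langle hg_1| \otimes \rho(hg_1) \right) \left( \sum_{g_2\in G} |hg_2\rangle\langle g_2| \otimes I_V \right) \left( \sum_{g_3\in G} |g_3\rangle\langle g_3| \otimes \rho(g_3^{-1}) \right) $$
$$ = \sum_{g\in G} |hg\rangle\langle g| \otimes \rho(h) = L(h) \otimes \rho(h) \tag{8.23} $$

$$ C_\rho^\dagger \left( R(h) \otimes I_V \right) C_\rho = \left( \sum_{g_1\in G} |g_1h^{-1}\rangle\langle g_1h^{-1}| \otimes \rho(hg_1^{-1}) \right) \left( \sum_{g_2\in G} |g_2h^{-1}\rangle\langle g_2| \otimes I_V \right) \left( \sum_{g_3\in G} |g_3\rangle\langle g_3| \otimes \rho(g_3) \right) $$
$$ = \sum_{g\in G} |gh^{-1}\rangle\langle g| \otimes \rho(h) = R(h) \otimes \rho(h) \tag{8.24} $$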
$V_\nu$ are in the maximally entangled state $|\Phi_\lambda\rangle = d_\lambda^{-1/2} \sum_{j=1}^{d_\lambda} |jj\rangle$ corresponding to the trivial irrep
that $V_\mu^*$ was initialized to.

Thus $\hat{C}_\rho$ corresponds to a CG transform from $V_\mu \otimes V_\nu$ to $V_\lambda$ and an inverse CG transform from
$V_\mu^*$ to $V_\lambda^* \otimes V_\nu$. These maps are sketched in Fig. 8-2.

Figure 8-2: Performing $\hat{C}_\rho$ combines $V_\mu$ and $V_\nu$ to form $V_\lambda$ and splits $V_\mu^*$ into $V_\lambda^*$ and $V_\nu$. Here
$V_\mu \otimes V_\mu^*$ and $V_\lambda \otimes V_\lambda^*$ come from the decomposition of $\mathbb{C}[G]$ and $V_\nu$ comes from the decomposition of
$V$.



We can verify the two maps separately by replacing $\hat{C}_\rho$ in Eq. (8.26) with $U_{\rm CG}$ or $U_{\rm CG}^\dagger$ acting
on the appropriate registers and checking that the representation matrices transform appropriately.
There are two details here which still need to be explained. First, our description of the CG transform
has not accounted for the multiplicity spaces that are generated. Second, we have not explained how
the inverse CG transform always creates the correct irreps $V_\lambda^* \otimes V_\nu$ when $\lambda$ is a label output by the
first CG transform. To explain both of these, we track the representation spaces through a series of
transformations equivalent to $\hat{C}_\rho$. As with $\hat{C}_\rho$, we will begin and end with $\mathbb{C}[G] \otimes V$, but we will
show how the component irreps transform along the way. Here, $\stackrel{U}{\cong}$ is used to mean that the unitary
operation $U$ implements the isomorphism; all of the isomorphisms respect the action of the group $G$.



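$$ \mathbb{C}[G] \otimes V \stackrel{U_{\rm QFT}}{\cong} \bigoplus_{\mu\in\hat{G}} V_\mu \otimes V_\mu^* \otimes \bigoplus_{\nu\in\hat{G}} V_\nu \otimes \mathbb{C}^{m_\nu} \tag{8.30} $$
$$ \stackrel{U_{\rm CG}}{\cong} \bigoplus_{\lambda,\mu,\nu\in\hat{G}} V_\lambda \otimes \operatorname{Hom}(V_\lambda, V_\mu \otimes V_\nu)^G \otimes V_\mu^* \otimes \mathbb{C}^{m_\nu} \tag{8.31} $$
$$ \cong \bigoplus_{\lambda,\mu,\nu\in\hat{G}} V_\lambda \otimes \operatorname{Hom}(V_\mu^*, V_\lambda^* \otimes V_\nu)^G \otimes V_\mu^* \otimes \mathbb{C}^{m_\nu} \tag{8.32} $$
$$ \stackrel{U_{\rm CG}^\dagger}{\cong} \bigoplus_{\lambda,\nu\in\hat{G}} V_\lambda \otimes V_\lambda^* \otimes V_\nu \otimes \mathbb{C}^{m_\nu} \tag{8.33} $$
$$ \stackrel{U_{\rm QFT}^\dagger}{\cong} \mathbb{C}[G] \otimes \bigoplus_{\nu\in\hat{G}} V_\nu \otimes \mathbb{C}^{m_\nu} = \mathbb{C}[G] \otimes V \tag{8.34} $$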

The isomorphism in Eq. (8.32) is based on repeated application of the identity $\operatorname{Hom}(A, B) \cong A^* \otimes
B$. This equivalence between $\operatorname{Hom}(V_\lambda, V_\mu \otimes V_\nu)^G$ and $\operatorname{Hom}(V_\mu^*, V_\lambda^* \otimes V_\nu)^G$ is the reason that
a CG transform followed by an inverse CG transform on different registers can yield the correct
representations in the output.
Application: using $U_{\rm QFT}$ to construct $U_{\rm CG}$

So far the discussion in this section has been rather abstract: we have shown that $\hat{C}_\rho$ acts in a way
analogous to $U_{\rm CG}$, but have not given any precise statement of a connection. To give the ideas in this
section operational meaning, we now show how $U_{\rm QFT}$ can be used to perform $U_{\rm CG}$ on an arbitrary
group. This idea is probably widely known, and has been used for the dihedral group in [Kup03], but
a presentation of this form has not appeared before in the literature.

The algorithm for $U_{\rm CG}$ is depicted in Fig. 8-3 and is described as follows:

Algorithm: Clebsch-Gordan transform using GPE

Input: $|\mu\rangle^{A_1}|v_\mu\rangle^{A_2}|\nu\rangle^{B_1}|v_\nu\rangle^{B_2}$, where $\mu, \nu \in \hat{G}$, $|v_\mu\rangle \in V_\mu$ and $|v_\nu\rangle \in V_\nu$.

Output: $|\lambda\rangle^{A_1}|v_\lambda\rangle^{A_2}|\nu\rangle^{B_1}|\alpha\rangle^{C}$ with $\lambda \in \hat{G}$, $|v_\lambda\rangle \in V_\lambda$ the irrep of the combined space and
$|\alpha\rangle \in (V_\mu \otimes V_\nu \otimes V_\lambda^*)^G$ the multiplicity label.
Runtime: $4T_{\rm QFT} + T_{C_L}$, where $T_{C_L}$ is the time of the controlled-$L$ operation.
Procedure:

1. Add states $|\Phi_\mu\rangle^{A_3A_4}$ and $|v_\nu^*\rangle^{B_3}$, where $|\Phi_\mu\rangle$ is the unique state (up to phase) in the one-dimensional
space $(V_\mu^* \otimes V_\mu)^G$ (cf. Eq. (6.33)) and $|v_\nu^*\rangle \in V_\nu^*$ is arbitrary.

2. Perform the inverse QFT on $A_1A_2A_3$ (yielding output $A$) and on $B_1B_2B_3$ (yielding output
$B$); i.e.

$$ (U_{\rm QFT}^\dagger)^{A_1A_2A_3 \to A} \otimes (U_{\rm QFT}^\dagger)^{B_1B_2B_3 \to B}. $$

Registers $A$ and $B$ now contain states in $\mathbb{C}[G]$.

3. Apply $C_L^{AB}$, mapping $|g_1\rangle^A|g_2\rangle^B$ to $|g_1\rangle^A|g_1g_2\rangle^B$.

4. Perform the QFT on $A$ and $B$, yielding output $A_1A_2A_3$ and $B_1B_2B_3$.

5. Discard the register $B_3$, which still contains the state $|v_\nu^*\rangle$.

6. $A_1$ now contains the combined irrep label, which we call $\lambda$. The irrep space $V_\lambda$ is in $A_2$,
while the multiplicity space $(V_\mu \otimes V_\nu \otimes V_\lambda^*)^G$ is in $A_4B_2A_3$, which we relabel as $C$.



Figure 8-3: Using $U_{\rm QFT}$ to construct $U_{\rm CG}$ for an arbitrary group. The inputs to the CG transform
are put in the $V_\mu$, $|\nu\rangle$ and $V_\nu$ registers. The $V_\nu^*$ register is not affected by the circuit, but is
included so that the QFT will have valid inputs. The output of the CG transform is in the $|\lambda\rangle$ and
$V_\lambda$ registers. $|\nu\rangle$ saves the irrep label of one of the inputs and $(V_\mu \otimes V_\nu \otimes V_\lambda^*)^G$ is the multiplicity
space.



The representations are transformed in the same way as in Fig. 8-2 with the addition of $V_\nu^*$, which
is left unchanged. This is because we only act on the second register using left multiplication, which
acts as the identity on $V_\nu^*$.
Thus, we can perform the CG transform efficiently whenever we can efficiently perform the QFT,
with the small caveat that efficiently manipulating the multiplicity space may take additional effort.
One application of this reduction is to choose $G = \mathcal{U}_d$ and thereby replace the $\mathcal{U}_d$ CG construction of
Section 7.3 with a CG transform based on the $\mathcal{U}_d$ QFT. Unfortunately, no fast quantum algorithms
are known for the $\mathcal{U}_d$ QFT and it is not immediately clear how quantizing $\mathbb{C}[\mathcal{U}_d]$ should correspond to
cutting off the irreps that appear in the decomposition $\bigoplus_\lambda \mathcal{Q}_\lambda^d \otimes (\mathcal{Q}_\lambda^d)^*$. However, QFTs are known
for discrete matrix groups such as $GL_n(\mathbb{F}_q)$ (running in time $q^{O(n)}$) [MRR04] and classical fast Fourier
transforms are known for $\mathcal{U}_d$ and other compact groups [MR97].

A problem related to the Ud QFT was also addressed in [Zal04], which sketched an algorithm 
for implementing qí\(U) in time polylogarithmic in |A|, though some crucial details about efficiently 
integrating Legendre functions remain to be established. In contrast, using the Schur transform to 
construct q^(U) would require poly(|A|) gates. The idea behind [Zal04] is to embed Q\ in Qf := 
®fi Q% w bere \i £ Z^. + and \fi\ can be exponentially large. Then Q^ corresponds to functions on 
the 2-sphere which can be discretized and efficiently rotated, though some creative techniques are 
necessary to perform this unitarily. It is possible that this approach could ultimately yield efficient
implementations of $U_{\rm CG}$ and $U_{\rm QFT}$ on $\mathcal{U}_d$.



8.2 Deriving the S n QFT from the Schur transform 

We conclude the chapter by showing how $U_{\rm Sch}$ can be used to construct $U_{\rm QFT}$. Of course, an efficient
algorithm for $U_{\rm QFT}$ already exists [Bea97], but the circuit we present here appears to be quite different.
Some of the mathematical principles behind this connection are in Thm. 9.2.8 of [GW98] and I am
grateful to Nolan Wallach for a very helpful conversation on this subject.

The algorithm is based on the embedding of $\mathcal{S}_n$ in $[n]^n$ given by $s \to (s(1), \ldots, s(n))$. This induces
a map from $\mathbb{C}[\mathcal{S}_n] \to (\mathbb{C}^n)^{\otimes n}$. More precisely, if $1^n$ denotes the weight $(1, \ldots, 1)$ with $n$ ones, then
we have a unitary map between $\mathbb{C}[\mathcal{S}_n]$ and $(\mathbb{C}^n)^{\otimes n}(1^n)$. This is the natural way we would represent a
permutation on a computer (quantum or classical): as a string of $n$ distinct numbers from $\{1, \ldots, n\}$.
Similarly, we can embed $\mathcal{S}_n$ in $\mathcal{U}_n$ by letting a permutation $s$ denote the unitary matrix $\sum_{i=1}^{n} |s(i)\rangle\langle i|$.

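Using this embedding, the algorithm for $U_{\rm QFT}$ is as follows:

Algorithm: $\mathcal{S}_n$ QFT using the Schur transform
Input: $\mathbb{C}[\mathcal{S}_n]$
Output: $\bigoplus_{\lambda\in\mathcal{I}_n} \mathcal{P}_\lambda \otimes \mathcal{P}_\lambda$.
Runtime: poly($n$, log 1/$\epsilon$).
Procedure:

1. Embed $\mathbb{C}[\mathcal{S}_n]$ in $(\mathbb{C}^n)^{\otimes n}(1^n)$.

2. Perform $U_{\rm Sch}$ on $(\mathbb{C}^n)^{\otimes n}(1^n)$ to output $|\lambda\rangle|q\rangle|p\rangle$.

3. Output $|\lambda\rangle$ as the irrep label, $|q\rangle$ as the state of the first $\mathcal{P}_\lambda$ and $|p\rangle$ for the second $\mathcal{P}_\lambda$.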

First we need to argue that setting $|q\rangle$ to be the $\mathcal{P}_\lambda$ output is well-defined. Note that $|q\rangle \in \mathcal{Q}_\lambda^n(1^n)$,
so if $|q\rangle$ is a GZ basis vector, then its branching pattern $(q_1, \ldots, q_n)$ satisfies $q_i \in q_{i+1} - \square$, and thus
$|q_1, \ldots, q_n\rangle \in \mathcal{P}_\lambda$.
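
The identification of branching patterns with basis vectors of $\mathcal{P}_\lambda$ can be sanity-checked by counting. In the sketch below (function names are ours, and partitions are represented as tuples by assumption) we count chains of partitions that grow by one box at each step and compare the result with the standard hook length formula for $\dim \mathcal{P}_\lambda$.

```python
from functools import lru_cache
from math import factorial
from typing import Tuple


@lru_cache(maxsize=None)
def count_chains(lam: Tuple[int, ...]) -> int:
    """Number of chains from the empty diagram to lam, adding one box per step."""
    if sum(lam) == 0:
        return 1
    total = 0
    for r in range(len(lam)):
        below = lam[r + 1] if r + 1 < len(lam) else 0
        # Row r has a removable box exactly when it is longer than the row below.
        if lam[r] > below:
            smaller = lam[:r] + (lam[r] - 1,) + lam[r + 1:]
            total += count_chains(tuple(x for x in smaller if x > 0))
    return total


def hook_length_dim(lam: Tuple[int, ...]) -> int:
    """Dimension of the S_n irrep labeled by lam, via the hook length formula."""
    prod = 1
    for r, row in enumerate(lam):
        for col in range(row):
            arm = row - col - 1
            leg = sum(1 for rr in range(r + 1, len(lam)) if lam[rr] > col)
            prod *= arm + leg + 1
    return factorial(sum(lam)) // prod


if __name__ == "__main__":
    for lam in [(4,), (3, 1), (2, 2), (2, 1, 1), (1, 1, 1, 1)]:
        print(lam, count_chains(lam), hook_length_dim(lam))
```

The two counts agree for every partition we have tried, and the squares of the dimensions for $n = 4$ sum to $4! = 24$, as required for a Fourier transform on $\mathcal{S}_4$.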

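Now to prove that this algorithm indeed performs a Fourier transform on $\mathcal{S}_n$, we examine a series
of isomorphisms. The Fourier transform relates $\mathbb{C}[\mathcal{S}_n]$ to $\bigoplus_\lambda \mathcal{P}_\lambda \otimes \mathcal{P}_\lambda$. Since weights are determined
by the action of the unitary group, restricting Eq. (5.16) on both sides to the $1^n$ weight space gives
the relation

$$ (\mathbb{C}^n)^{\otimes n}(1^n) \cong \bigoplus_{\lambda\in\mathcal{I}_n} \mathcal{Q}_\lambda^n(1^n) \otimes \mathcal{P}_\lambda. \tag{8.35} $$

Thus we have the isomorphisms:

$$ \begin{array}{ccc}
\mathbb{C}[\mathcal{S}_n] & \stackrel{(1)}{\cong} & (\mathbb{C}^n)^{\otimes n}(1^n) \\
{\scriptstyle (2)}\big\downarrow U_{\rm QFT} & & {\scriptstyle (3)}\big\downarrow U_{\rm Sch} \\
\bigoplus_{\lambda\in\mathcal{I}_n} \mathcal{P}_\lambda \otimes \mathcal{P}_\lambda & \stackrel{(4)}{\cong} & \bigoplus_{\lambda\in\mathcal{I}_n} \mathcal{Q}_\lambda^n(1^n) \otimes \mathcal{P}_\lambda
\end{array} $$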

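Our goal is to understand the isomorphism (4) by examining how the other isomorphisms act
on representation matrices. First we look at how (1) relates $\mathbf{P}, \mathbf{Q}$ with $L, R$. Note that $\mathbf{Q}(\mathcal{S}_n)$ and
$\mathbf{P}(\mathcal{S}_n)$ act on $(\mathbb{C}^n)^{\otimes n}(1^n)$ according to

$$ \mathbf{Q}(\pi) \bigotimes_{j=1}^{n} |s(j)\rangle = \bigotimes_{j=1}^{n} |\pi(s(j))\rangle \quad\text{and}\quad \mathbf{P}(\pi) \bigotimes_{j=1}^{n} |s(j)\rangle = \bigotimes_{j=1}^{n} |s(\pi^{-1}(j))\rangle. \tag{8.36} $$

And from the definition of multiplying permutations, $L$ and $R$ act on $\mathbb{C}[\mathcal{S}_n]$ according to

$$ L(\pi) \bigotimes_{j=1}^{n} |s(j)\rangle = \bigotimes_{j=1}^{n} |\pi(s(j))\rangle \quad\text{and}\quad R(\pi) \bigotimes_{j=1}^{n} |s(j)\rangle = \bigotimes_{j=1}^{n} |s(\pi^{-1}(j))\rangle, \tag{8.37} $$

if we write permutations as elements of $[n]^n$. Thus the embedding map relates $L$ and $R$ to $\mathbf{Q}$ and $\mathbf{P}$
respectively.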
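
The correspondence between Eqs. (8.36) and (8.37) is easy to confirm by brute force for small $n$. In the following sketch a permutation $s$ is stored as the tuple $(s(0), \ldots, s(n-1))$, using 0-based labels for convenience; the function names are illustrative assumptions.

```python
from itertools import permutations
from typing import Tuple

Perm = Tuple[int, ...]


def compose(a: Perm, b: Perm) -> Perm:
    """Group product (ab)(j) = a(b(j))."""
    return tuple(a[b[j]] for j in range(len(b)))


def inverse(a: Perm) -> Perm:
    out = [0] * len(a)
    for j, x in enumerate(a):
        out[x] = j
    return tuple(out)


def act_Q(pi: Perm, string: Perm) -> Perm:
    """Act on the values: every symbol i is replaced by pi(i)."""
    return tuple(pi[x] for x in string)


def act_P(pi: Perm, string: Perm) -> Perm:
    """Act on the positions: slot j receives the symbol from slot pi^{-1}(j)."""
    pinv = inverse(pi)
    return tuple(string[pinv[j]] for j in range(len(string)))


def embedding_intertwines(n: int) -> bool:
    """Check that L(pi) matches Q(pi) and R(pi) matches P(pi) on all of S_n."""
    group = list(permutations(range(n)))
    for pi in group:
        for s in group:
            left = compose(pi, s)             # L(pi)|s> = |pi s>
            right = compose(s, inverse(pi))   # R(pi)|s> = |s pi^{-1}>
            if left != act_Q(pi, s) or right != act_P(pi, s):
                return False
    return True


if __name__ == "__main__":
    print(embedding_intertwines(4))
```

Left multiplication changes the symbols of the string while right multiplication rearranges its positions, which is exactly the split between the $\mathbf{Q}$ and $\mathbf{P}$ actions.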

This means that for any $\pi_1, \pi_2 \in \mathcal{S}_n$, the isomorphism (4) maps $\sum_\lambda |\lambda\rangle\langle\lambda| \otimes \mathbf{p}_\lambda(\pi_1) \otimes \mathbf{p}_\lambda(\pi_2)$ to
$\sum_\lambda |\lambda\rangle\langle\lambda| \otimes \mathbf{q}_\lambda^n(\pi_1)|_{\mathcal{Q}_\lambda^n(1^n)} \otimes \mathbf{p}_\lambda(\pi_2)$. This proves that $\mathcal{Q}_\lambda^n(1^n) \stackrel{\mathcal{S}_n}{\cong} \mathcal{P}_\lambda$ (cf. Thm 9.2.8 of [GW98]).

Moreover, it is straightforward to verify that the GZ basis of $\mathcal{Q}_\lambda^n(1^n)$ corresponds to the same
chain of partitions that labels the GZ basis of $\mathcal{P}_\lambda$; one need only look at which weights appear in the
restriction to $\mathcal{S}_{n-1} \subset \mathcal{U}_{n-1}$. This establishes that the representation matrices are the same, up to
an arbitrary phase difference for each basis vector. The existence of this phase means that we have
constructed a slightly different Fourier transform than [Bea97], and it is an interesting open question
to calculate this phase difference and determine its significance.